[all-commits] [llvm/llvm-project] 827f2a: AMDGPU: Convert vector 64-bit shl to 32-bit if shi...
Fangrui Song via All-commits
all-commits at lists.llvm.org
Fri Mar 28 11:39:58 PDT 2025
Branch: refs/heads/users/MaskRay/spr/riscvasmparser-dont-treat-operands-with-relocation-specifier-as-parse-time-constants
Home: https://github.com/llvm/llvm-project
Commit: 827f2ad64394bdcafabc58133c4b8258f728763c
https://github.com/llvm/llvm-project/commit/827f2ad64394bdcafabc58133c4b8258f728763c
Author: LU-JOHN <John.Lu at amd.com>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
M llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
Log Message:
-----------
AMDGPU: Convert vector 64-bit shl to 32-bit if shift amt >= 32 (#132964)
Convert vector 64-bit shl to 32-bit if shift amt is known to be >= 32.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
Commit: 44e3735ac18407f23a4e1fea29192449dbe3d4de
https://github.com/llvm/llvm-project/commit/44e3735ac18407f23a4e1fea29192449dbe3d4de
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
A llvm/test/tools/llvm-reduce/remove-arguments-preserve-wrong-cc.ll
M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp
Log Message:
-----------
llvm-reduce: Preserve original callsite calling conv when removing arguments (#133411)
In undefined mismatch cases, this was fixing the callsite to use the calling
convention of the new function. Preserve the original wrong callsite's calling
convention.
Commit: 133c1afa8e54d4e9599cb0d8aad843c03f0e63b9
https://github.com/llvm/llvm-project/commit/133c1afa8e54d4e9599cb0d8aad843c03f0e63b9
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2025-03-29 (Sat, 29 Mar 2025)
Changed paths:
A llvm/test/tools/llvm-reduce/reduce-arguments-invoke.ll
A llvm/test/tools/llvm-reduce/reduce-arguments-non-callee-use.ll
M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp
Log Message:
-----------
llvm-reduce: Filter function based on uses before removing arguments (#133412)
Invokes and others are not handled, so this was leaving broken callsites
behind for anything other than CallInst
Commit: 049f179606a4af3ea650d7049626d267e01b79e2
https://github.com/llvm/llvm-project/commit/049f179606a4af3ea650d7049626d267e01b79e2
Author: Tim Gymnich <tim at gymni.ch>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/include/llvm/Analysis/ValueTracking.h
A llvm/include/llvm/Support/KnownFPClass.h
M llvm/lib/Analysis/InstructionSimplify.cpp
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/lib/Support/CMakeLists.txt
A llvm/lib/Support/KnownFPClass.cpp
M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
M llvm/lib/Transforms/IPO/AttributorAttributes.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
M llvm/lib/Transforms/InstCombine/InstCombineInternal.h
M llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
M llvm/unittests/Analysis/ValueTrackingTest.cpp
M llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn
Log Message:
-----------
[Analysis][NFC] Extract KnownFPClass (#133457)
- extract KnownFPClass for future use inside of GISelKnownBits
---------
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
Commit: 8c7550132f410766598ba88de6b7e7a2a4607f67
https://github.com/llvm/llvm-project/commit/8c7550132f410766598ba88de6b7e7a2a4607f67
Author: Ana Mihajlovic <Ana.Mihajlovic at amd.com>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/bf16.ll
M llvm/test/CodeGen/AMDGPU/carryout-selection.ll
M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-flat.ll
M llvm/test/CodeGen/AMDGPU/div-rem-by-constant-64.ll
M llvm/test/CodeGen/AMDGPU/dpp64_combine.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
M llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
M llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll
M llvm/test/CodeGen/AMDGPU/fold-int-pow2-with-fmul-or-fdiv.ll
M llvm/test/CodeGen/AMDGPU/gfx10-vop-literal.ll
M llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
M llvm/test/CodeGen/AMDGPU/idiv-licm.ll
M llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
M llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
M llvm/test/CodeGen/AMDGPU/load-constant-always-uniform.ll
M llvm/test/CodeGen/AMDGPU/lrint.ll
M llvm/test/CodeGen/AMDGPU/lround.ll
M llvm/test/CodeGen/AMDGPU/machine-sink-temporal-divergence-swdev407790.ll
M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
M llvm/test/CodeGen/AMDGPU/match-perm-extract-vector-elt-bug.ll
M llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
M llvm/test/CodeGen/AMDGPU/memmove-var-size.ll
M llvm/test/CodeGen/AMDGPU/mul.ll
M llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
M llvm/test/CodeGen/AMDGPU/offset-split-global.ll
M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
M llvm/test/CodeGen/AMDGPU/reassoc-mul-add-1-to-mad.ll
M llvm/test/CodeGen/AMDGPU/saddo.ll
M llvm/test/CodeGen/AMDGPU/saddsat.ll
M llvm/test/CodeGen/AMDGPU/shl_add_ptr_csub.ll
M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
M llvm/test/CodeGen/AMDGPU/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/sub.ll
M llvm/test/CodeGen/AMDGPU/uaddsat.ll
M llvm/test/CodeGen/AMDGPU/udiv.ll
M llvm/test/CodeGen/AMDGPU/usubsat.ll
M llvm/test/CodeGen/AMDGPU/vector-reduce-add.ll
M llvm/test/CodeGen/AMDGPU/vgpr-mark-last-scratch-load.ll
Log Message:
-----------
[AMDGPU] Unused sdst writing to null (#133229)
Unused sdst writing to null to avoid a false VALU->SALU dependency
stall. This requires using the VOP3 encoding.
Commit: 4c4e4e4299b16d8dd85811e9dd8697b17c95577f
https://github.com/llvm/llvm-project/commit/4c4e4e4299b16d8dd85811e9dd8697b17c95577f
Author: Ramkumar Ramachandra <ramkumar.ramachandra at codasip.com>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Log Message:
-----------
[LV] Strengthen calls to collectInstsToScalarize (NFC) (#130642)
Avoid the pattern of always calling collectInstsToScalarize after
collectUniformsAndScalars, and call it in collectUniformsAndScalars
instead. Also strengthen checks for early exits in the function.
Commit: b8c86dc7699d3c561bcbee143d2022561d5cc8a3
https://github.com/llvm/llvm-project/commit/b8c86dc7699d3c561bcbee143d2022561d5cc8a3
Author: Philip Reames <preames at rivosinc.com>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
M llvm/lib/Target/SystemZ/SystemZInstrInfo.h
Log Message:
-----------
[SystemZ] Remove custom implementation of optimizeLoadInst [NFC] (#133123)
In 236f938ef, I introduced a generic version of this routine. I believe
that the SystemZ specific version of this is less general than the
generic version, and is thus unrequired. I wasn't 100% given the
difference in sub-register, multiple use and defs, but from the SystemZ
code, it looks like those cases simply don't arise?
Commit: 4aa5e9b722c7d27130b7e4453d221d5a057474a7
https://github.com/llvm/llvm-project/commit/4aa5e9b722c7d27130b7e4453d221d5a057474a7
Author: Fangrui Song <i at maskray.me>
Date: 2025-03-28 (Fri, 28 Mar 2025)
Changed paths:
M llvm/include/llvm/Analysis/ValueTracking.h
A llvm/include/llvm/Support/KnownFPClass.h
M llvm/lib/Analysis/InstructionSimplify.cpp
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/lib/Support/CMakeLists.txt
A llvm/lib/Support/KnownFPClass.cpp
M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
M llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
M llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
M llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
M llvm/lib/Target/SystemZ/SystemZInstrInfo.h
M llvm/lib/Transforms/IPO/AttributorAttributes.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
M llvm/lib/Transforms/InstCombine/InstCombineInternal.h
M llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/bf16.ll
M llvm/test/CodeGen/AMDGPU/carryout-selection.ll
M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-flat.ll
M llvm/test/CodeGen/AMDGPU/div-rem-by-constant-64.ll
M llvm/test/CodeGen/AMDGPU/dpp64_combine.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
M llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
M llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll
M llvm/test/CodeGen/AMDGPU/fold-int-pow2-with-fmul-or-fdiv.ll
M llvm/test/CodeGen/AMDGPU/gfx10-vop-literal.ll
M llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
M llvm/test/CodeGen/AMDGPU/idiv-licm.ll
M llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
M llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
M llvm/test/CodeGen/AMDGPU/load-constant-always-uniform.ll
M llvm/test/CodeGen/AMDGPU/lrint.ll
M llvm/test/CodeGen/AMDGPU/lround.ll
M llvm/test/CodeGen/AMDGPU/machine-sink-temporal-divergence-swdev407790.ll
M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
M llvm/test/CodeGen/AMDGPU/match-perm-extract-vector-elt-bug.ll
M llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
M llvm/test/CodeGen/AMDGPU/memmove-var-size.ll
M llvm/test/CodeGen/AMDGPU/mul.ll
M llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
M llvm/test/CodeGen/AMDGPU/offset-split-global.ll
M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
M llvm/test/CodeGen/AMDGPU/reassoc-mul-add-1-to-mad.ll
M llvm/test/CodeGen/AMDGPU/saddo.ll
M llvm/test/CodeGen/AMDGPU/saddsat.ll
M llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
M llvm/test/CodeGen/AMDGPU/shl_add_ptr_csub.ll
M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
M llvm/test/CodeGen/AMDGPU/ssubsat.ll
M llvm/test/CodeGen/AMDGPU/sub.ll
M llvm/test/CodeGen/AMDGPU/uaddsat.ll
M llvm/test/CodeGen/AMDGPU/udiv.ll
M llvm/test/CodeGen/AMDGPU/usubsat.ll
M llvm/test/CodeGen/AMDGPU/vector-reduce-add.ll
M llvm/test/CodeGen/AMDGPU/vgpr-mark-last-scratch-load.ll
M llvm/test/MC/RISCV/rvi-valid.s
A llvm/test/tools/llvm-reduce/reduce-arguments-invoke.ll
A llvm/test/tools/llvm-reduce/reduce-arguments-non-callee-use.ll
A llvm/test/tools/llvm-reduce/remove-arguments-preserve-wrong-cc.ll
M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp
M llvm/unittests/Analysis/ValueTrackingTest.cpp
M llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn
Log Message:
-----------
Simplify simm12 and uimm20 (lui,auipc)
Created using spr 1.3.5-bogner
Compare: https://github.com/llvm/llvm-project/compare/35830c252b11...4aa5e9b722c7
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list