[all-commits] [llvm/llvm-project] 827f2a: AMDGPU: Convert vector 64-bit shl to 32-bit if shi...

Fangrui Song via All-commits all-commits at lists.llvm.org
Fri Mar 28 11:39:58 PDT 2025


  Branch: refs/heads/users/MaskRay/spr/riscvasmparser-dont-treat-operands-with-relocation-specifier-as-parse-time-constants
  Home:   https://github.com/llvm/llvm-project
  Commit: 827f2ad64394bdcafabc58133c4b8258f728763c
      https://github.com/llvm/llvm-project/commit/827f2ad64394bdcafabc58133c4b8258f728763c
  Author: LU-JOHN <John.Lu at amd.com>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/shl64_reduce.ll

  Log Message:
  -----------
  AMDGPU: Convert vector 64-bit shl to 32-bit if shift amt >= 32 (#132964)

Convert vector 64-bit shl to 32-bit if shift amt is known to be >= 32.

---------

Signed-off-by: John Lu <John.Lu at amd.com>


  Commit: 44e3735ac18407f23a4e1fea29192449dbe3d4de
      https://github.com/llvm/llvm-project/commit/44e3735ac18407f23a4e1fea29192449dbe3d4de
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    A llvm/test/tools/llvm-reduce/remove-arguments-preserve-wrong-cc.ll
    M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp

  Log Message:
  -----------
  llvm-reduce: Preserve original callsite calling conv when removing arguments (#133411)

In undefined mismatch cases, this was fixing the callsite to use the calling
convention of the new function. Preserve the original wrong callsite's calling
convention.


  Commit: 133c1afa8e54d4e9599cb0d8aad843c03f0e63b9
      https://github.com/llvm/llvm-project/commit/133c1afa8e54d4e9599cb0d8aad843c03f0e63b9
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2025-03-29 (Sat, 29 Mar 2025)

  Changed paths:
    A llvm/test/tools/llvm-reduce/reduce-arguments-invoke.ll
    A llvm/test/tools/llvm-reduce/reduce-arguments-non-callee-use.ll
    M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp

  Log Message:
  -----------
  llvm-reduce: Filter function based on uses before removing arguments (#133412)

Invokes and others are not handled, so this was leaving broken callsites
behind for anything other than CallInst


  Commit: 049f179606a4af3ea650d7049626d267e01b79e2
      https://github.com/llvm/llvm-project/commit/049f179606a4af3ea650d7049626d267e01b79e2
  Author: Tim Gymnich <tim at gymni.ch>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/include/llvm/Analysis/ValueTracking.h
    A llvm/include/llvm/Support/KnownFPClass.h
    M llvm/lib/Analysis/InstructionSimplify.cpp
    M llvm/lib/Analysis/ValueTracking.cpp
    M llvm/lib/Support/CMakeLists.txt
    A llvm/lib/Support/KnownFPClass.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
    M llvm/lib/Transforms/IPO/AttributorAttributes.cpp
    M llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
    M llvm/lib/Transforms/InstCombine/InstCombineInternal.h
    M llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
    M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
    M llvm/unittests/Analysis/ValueTrackingTest.cpp
    M llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn

  Log Message:
  -----------
  [Analysis][NFC] Extract KnownFPClass (#133457)

- extract KnownFPClass for future use inside of GISelKnownBits

---------

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>


  Commit: 8c7550132f410766598ba88de6b7e7a2a4607f67
      https://github.com/llvm/llvm-project/commit/8c7550132f410766598ba88de6b7e7a2a4607f67
  Author: Ana Mihajlovic <Ana.Mihajlovic at amd.com>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
    M llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
    M llvm/test/CodeGen/AMDGPU/bf16.ll
    M llvm/test/CodeGen/AMDGPU/carryout-selection.ll
    M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-flat.ll
    M llvm/test/CodeGen/AMDGPU/div-rem-by-constant-64.ll
    M llvm/test/CodeGen/AMDGPU/dpp64_combine.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
    M llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
    M llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
    M llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll
    M llvm/test/CodeGen/AMDGPU/fold-int-pow2-with-fmul-or-fdiv.ll
    M llvm/test/CodeGen/AMDGPU/gfx10-vop-literal.ll
    M llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
    M llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
    M llvm/test/CodeGen/AMDGPU/idiv-licm.ll
    M llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
    M llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
    M llvm/test/CodeGen/AMDGPU/load-constant-always-uniform.ll
    M llvm/test/CodeGen/AMDGPU/lrint.ll
    M llvm/test/CodeGen/AMDGPU/lround.ll
    M llvm/test/CodeGen/AMDGPU/machine-sink-temporal-divergence-swdev407790.ll
    M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
    M llvm/test/CodeGen/AMDGPU/match-perm-extract-vector-elt-bug.ll
    M llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
    M llvm/test/CodeGen/AMDGPU/memmove-var-size.ll
    M llvm/test/CodeGen/AMDGPU/mul.ll
    M llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
    M llvm/test/CodeGen/AMDGPU/offset-split-global.ll
    M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
    M llvm/test/CodeGen/AMDGPU/reassoc-mul-add-1-to-mad.ll
    M llvm/test/CodeGen/AMDGPU/saddo.ll
    M llvm/test/CodeGen/AMDGPU/saddsat.ll
    M llvm/test/CodeGen/AMDGPU/shl_add_ptr_csub.ll
    M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
    M llvm/test/CodeGen/AMDGPU/ssubsat.ll
    M llvm/test/CodeGen/AMDGPU/sub.ll
    M llvm/test/CodeGen/AMDGPU/uaddsat.ll
    M llvm/test/CodeGen/AMDGPU/udiv.ll
    M llvm/test/CodeGen/AMDGPU/usubsat.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-add.ll
    M llvm/test/CodeGen/AMDGPU/vgpr-mark-last-scratch-load.ll

  Log Message:
  -----------
  [AMDGPU] Unused sdst writing to null (#133229)

Unused sdst writing to null to avoid a false VALU->SALU dependency
stall. This requires using the VOP3 encoding.


  Commit: 4c4e4e4299b16d8dd85811e9dd8697b17c95577f
      https://github.com/llvm/llvm-project/commit/4c4e4e4299b16d8dd85811e9dd8697b17c95577f
  Author: Ramkumar Ramachandra <ramkumar.ramachandra at codasip.com>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

  Log Message:
  -----------
  [LV] Strengthen calls to collectInstsToScalarize (NFC) (#130642)

Avoid the pattern of always calling collectInstsToScalarize after
collectUniformsAndScalars, and call it in collectUniformsAndScalars
instead. Also strengthen checks for early exits in the function.


  Commit: b8c86dc7699d3c561bcbee143d2022561d5cc8a3
      https://github.com/llvm/llvm-project/commit/b8c86dc7699d3c561bcbee143d2022561d5cc8a3
  Author: Philip Reames <preames at rivosinc.com>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
    M llvm/lib/Target/SystemZ/SystemZInstrInfo.h

  Log Message:
  -----------
  [SystemZ] Remove custom implementation of optimizeLoadInst [NFC] (#133123)

In 236f938ef, I introduced a generic version of this routine. I believe
that the SystemZ specific version of this is less general than the
generic version, and is thus unrequired. I wasn't 100% given the
difference in sub-register, multiple use and defs, but from the SystemZ
code, it looks like those cases simply don't arise?


  Commit: 4aa5e9b722c7d27130b7e4453d221d5a057474a7
      https://github.com/llvm/llvm-project/commit/4aa5e9b722c7d27130b7e4453d221d5a057474a7
  Author: Fangrui Song <i at maskray.me>
  Date:   2025-03-28 (Fri, 28 Mar 2025)

  Changed paths:
    M llvm/include/llvm/Analysis/ValueTracking.h
    A llvm/include/llvm/Support/KnownFPClass.h
    M llvm/lib/Analysis/InstructionSimplify.cpp
    M llvm/lib/Analysis/ValueTracking.cpp
    M llvm/lib/Support/CMakeLists.txt
    A llvm/lib/Support/KnownFPClass.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
    M llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
    M llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
    M llvm/lib/Target/SystemZ/SystemZInstrInfo.h
    M llvm/lib/Transforms/IPO/AttributorAttributes.cpp
    M llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
    M llvm/lib/Transforms/InstCombine/InstCombineInternal.h
    M llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
    M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
    M llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
    M llvm/test/CodeGen/AMDGPU/bf16.ll
    M llvm/test/CodeGen/AMDGPU/carryout-selection.ll
    M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-flat.ll
    M llvm/test/CodeGen/AMDGPU/div-rem-by-constant-64.ll
    M llvm/test/CodeGen/AMDGPU/dpp64_combine.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
    M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
    M llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
    M llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
    M llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll
    M llvm/test/CodeGen/AMDGPU/fold-int-pow2-with-fmul-or-fdiv.ll
    M llvm/test/CodeGen/AMDGPU/gfx10-vop-literal.ll
    M llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
    M llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
    M llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
    M llvm/test/CodeGen/AMDGPU/idiv-licm.ll
    M llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
    M llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
    M llvm/test/CodeGen/AMDGPU/load-constant-always-uniform.ll
    M llvm/test/CodeGen/AMDGPU/lrint.ll
    M llvm/test/CodeGen/AMDGPU/lround.ll
    M llvm/test/CodeGen/AMDGPU/machine-sink-temporal-divergence-swdev407790.ll
    M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
    M llvm/test/CodeGen/AMDGPU/match-perm-extract-vector-elt-bug.ll
    M llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
    M llvm/test/CodeGen/AMDGPU/memmove-var-size.ll
    M llvm/test/CodeGen/AMDGPU/mul.ll
    M llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
    M llvm/test/CodeGen/AMDGPU/offset-split-global.ll
    M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
    M llvm/test/CodeGen/AMDGPU/reassoc-mul-add-1-to-mad.ll
    M llvm/test/CodeGen/AMDGPU/saddo.ll
    M llvm/test/CodeGen/AMDGPU/saddsat.ll
    M llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
    M llvm/test/CodeGen/AMDGPU/shl_add_ptr_csub.ll
    M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
    M llvm/test/CodeGen/AMDGPU/ssubsat.ll
    M llvm/test/CodeGen/AMDGPU/sub.ll
    M llvm/test/CodeGen/AMDGPU/uaddsat.ll
    M llvm/test/CodeGen/AMDGPU/udiv.ll
    M llvm/test/CodeGen/AMDGPU/usubsat.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-add.ll
    M llvm/test/CodeGen/AMDGPU/vgpr-mark-last-scratch-load.ll
    M llvm/test/MC/RISCV/rvi-valid.s
    A llvm/test/tools/llvm-reduce/reduce-arguments-invoke.ll
    A llvm/test/tools/llvm-reduce/reduce-arguments-non-callee-use.ll
    A llvm/test/tools/llvm-reduce/remove-arguments-preserve-wrong-cc.ll
    M llvm/tools/llvm-reduce/deltas/ReduceArguments.cpp
    M llvm/unittests/Analysis/ValueTrackingTest.cpp
    M llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn

  Log Message:
  -----------
  Simplify simm12 and uimm20 (lui,auipc)

Created using spr 1.3.5-bogner


Compare: https://github.com/llvm/llvm-project/compare/35830c252b11...4aa5e9b722c7

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list