[all-commits] [llvm/llvm-project] fbdf8a: [LSV] Merge contiguous chains across scalar types ...

Mon Dec 1 20:05:38 PST 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fbdf8ab59005bc35f23b3167e0783013c7ee5fa4
      https://github.com/llvm/llvm-project/commit/fbdf8ab59005bc35f23b3167e0783013c7ee5fa4
  Author: Anshil Gandhi <95053726+gandhi56 at users.noreply.github.com>
  Date:   2025-12-01 (Mon, 01 Dec 2025)

  Changed paths:
    M llvm/include/llvm/Transforms/Utils/Local.h
    M llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp
    M llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
    M llvm/lib/Transforms/Scalar/SROA.cpp
    M llvm/lib/Transforms/Utils/Local.cpp
    M llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/amdgpu-irtranslator.ll
    M llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
    M llvm/test/CodeGen/AMDGPU/build_vector.ll
    M llvm/test/CodeGen/AMDGPU/fabs.bf16.ll
    M llvm/test/CodeGen/AMDGPU/fabs.f16.ll
    M llvm/test/CodeGen/AMDGPU/fabs.ll
    M llvm/test/CodeGen/AMDGPU/fcopysign.f32.ll
    M llvm/test/CodeGen/AMDGPU/fdiv.ll
    M llvm/test/CodeGen/AMDGPU/fnearbyint.ll
    M llvm/test/CodeGen/AMDGPU/fneg-fabs.bf16.ll
    M llvm/test/CodeGen/AMDGPU/fneg-fabs.f16.ll
    M llvm/test/CodeGen/AMDGPU/fneg-fabs.ll
    M llvm/test/CodeGen/AMDGPU/fneg.ll
    M llvm/test/CodeGen/AMDGPU/fp_to_sint.ll
    M llvm/test/CodeGen/AMDGPU/fp_to_uint.ll
    M llvm/test/CodeGen/AMDGPU/fshl.ll
    M llvm/test/CodeGen/AMDGPU/fshr.ll
    M llvm/test/CodeGen/AMDGPU/global_atomics.ll
    M llvm/test/CodeGen/AMDGPU/half.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
    M llvm/test/CodeGen/AMDGPU/kernel-args.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx950.ll
    M llvm/test/CodeGen/AMDGPU/llvm.exp.ll
    M llvm/test/CodeGen/AMDGPU/llvm.exp10.ll
    M llvm/test/CodeGen/AMDGPU/llvm.exp2.ll
    M llvm/test/CodeGen/AMDGPU/llvm.log.ll
    M llvm/test/CodeGen/AMDGPU/llvm.log10.ll
    M llvm/test/CodeGen/AMDGPU/llvm.log2.ll
    M llvm/test/CodeGen/AMDGPU/max.ll
    M llvm/test/CodeGen/AMDGPU/min.ll
    M llvm/test/CodeGen/AMDGPU/packed-fp32.ll
    M llvm/test/CodeGen/AMDGPU/rotl.ll
    M llvm/test/CodeGen/AMDGPU/rotr.ll
    M llvm/test/CodeGen/AMDGPU/s_addk_i32.ll
    M llvm/test/CodeGen/AMDGPU/sminmax.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/store-to-constant.ll
    M llvm/test/CodeGen/AMDGPU/udivrem.ll
    M llvm/test/CodeGen/AMDGPU/uint_to_fp.f64.ll
    A llvm/test/Transforms/InstCombine/copy-access-metadata.ll
    A llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/copy-metadata-load-store.ll
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-vectors-complex.ll
    M llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-vectors.ll

  Log Message:
  -----------
  [LSV] Merge contiguous chains across scalar types (#154069)

This change enables the LoadStoreVectorizer to merge and vectorize
contiguous chains even when their scalar element types differ, as long
as the total bitwidth matches. To do so, we rebase offsets between
chains, normalize value types to a common integer type, and insert the
necessary casts around loads and stores. This uncovers more
vectorization opportunities and explains the expected codegen updates
across AMDGPU tests.

Key changes:
- Chain merging
  - Build contiguous subchains and then merge adjacent ones when:
- They refer to the same underlying pointer object and address space.
    - They are either all loads or all stores.
    - A constant leader-to-leader delta exists.
- Rebasing one chain into the other's coordinate space does not overlap.
    - All elements have equal total bit width.
- Rebase the second chain by the computed delta and append it to the
first.

- Type normalization and casting
- Normalize merged chains to a common integer type sized to the total
bits.
- For loads: create a new load of the normalized type, copy metadata,
and cast back to the original type for uses if needed.
  - For stores: bitcast the value to the normalized type and store that.
- Insert zext/trunc for integer size changes; use bit-or-pointer casts
when sizes match.

- Cleanups
  - Erase replaced instructions and DCE pointer operands when safe.
- New helpers: computeLeaderDelta, chainsOverlapAfterRebase,
rebaseChain, normalizeChainToType, and allElemsMatchTotalBits.

Impact:
- Increases vectorization opportunities across mixed-typed but
size-compatible access chains.
- Large set of expected AMDGPU codegen diffs due to more/changed
vectorization.

This PR resolves #97715.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications