[all-commits] [llvm/llvm-project] 2c091e: AMDGPU: Report unaligned scratch access as fast if...

Mon Sep 15 03:03:24 PDT 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 2c091e6aec2d48fbcafc9cc5909a62f0321db1fd
      https://github.com/llvm/llvm-project/commit/2c091e6aec2d48fbcafc9cc5909a62f0321db1fd
  Author: macurtis-amd <macurtis at amd.com>
  Date:   2025-09-15 (Mon, 15 Sep 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/memcpy-fixed-align.ll
    M llvm/test/CodeGen/AMDGPU/memcpy-libcall.ll
    M llvm/test/CodeGen/AMDGPU/memcpy-param-combinations.ll
    M llvm/test/CodeGen/AMDGPU/memmove-param-combinations.ll
    A llvm/test/Transforms/AggressiveInstCombine/AMDGPU/fold-consecutive-loads.ll
    A llvm/test/Transforms/AggressiveInstCombine/AMDGPU/lit.local.cfg

  Log Message:
  -----------
  AMDGPU: Report unaligned scratch access as fast if supported by tgt (#158036)

This enables more consecutive load folding during
aggressive-instcombine.

The original motivating example provided by Jeff Byrnes:
https://godbolt.org/z/8ebcTEjTs

Example provided by Nikita Popov: https://godbolt.org/z/Gv1j4vjqE as
part of my original attempt to fix the issue (PR
[#133301](https://github.com/llvm/llvm-project/pull/133301), see his
[comment](https://github.com/llvm/llvm-project/pull/133301#issuecomment-2984905809)).

This changes the value of `IsFast` returned by `In
SITargetLowering::allowsMisalignedMemoryAccessesImpl` to be non-zero for
private and flat addresses if the subtarget supports unaligned scratch
accesses.

This enables aggressive-instcombine to do more folding of consecutive
loads (see
[here](https://github.com/llvm/llvm-project/blob/cbd496581fb6953a9a8d8387a010cc3a67d4654b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp#L811)).

Summary performance impact on
[composable_kernel](https://github.com/ROCm/composable_kernel):

|GPU|speedup (geomean*)|
|---|---|
|MI300A| 1.11|
|MI300X| 1.14|
|MI350X| 1.03|

[*] Just to be clear, this is the geomean across kernels which were
impacted by this change - not across all CK kernels.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications