[all-commits] [llvm/llvm-project] b3995a: [AMDGPU] Decrease default NSA threshold from 3 to ...

Jay Foad via All-commits all-commits at lists.llvm.org
Tue Nov 19 07:54:49 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b3995aa338a2837626d31ae8fffc340d95b888ca
      https://github.com/llvm/llvm-project/commit/b3995aa338a2837626d31ae8fffc340d95b888ca
  Author: Jay Foad <jay.foad at amd.com>
  Date:   2024-11-19 (Tue, 19 Nov 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/bitcast_38_i16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/image-waterfall-loop-O0.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.atomic.dim.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll
    M llvm/test/CodeGen/AMDGPU/adjust-writemask-cse.ll
    M llvm/test/CodeGen/AMDGPU/issue92561-restore-undef-scc-verifier-error.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.nsa.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.d16.dim.ll
    M llvm/test/CodeGen/AMDGPU/move-to-valu-vimage-vsample.ll
    M llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll
    M llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
    M llvm/test/CodeGen/AMDGPU/wqm.ll

  Log Message:
  -----------
  [AMDGPU] Decrease default NSA threshold from 3 to 2 (#116624)

In graphics shaders it is better overall to use NSA encoding for IMAGE
instructions, because the benefit of less constrained register
allocation outweighs the cost of larger encoding. In particular NSA form
often avoids the need for extra V_MOV_B32 instructions between IMAGE
instructions, which can allow the IMAGE instructions to be claused.

Note that in GFX12 there is no longer a bit in the encoding to choose
between NSA and non-NSA forms, so this only affects GFX10 and GFX11.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list