[PATCH] D151703: [AMDGPU][LSV] Restrict forming extra large vectors

Tue May 30 04:59:00 PDT 2023

piotr created this revision.
Herald added subscribers: foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl, arsenm.
Herald added a project: All.
piotr requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

Restrict the bit width of the largest vector type used in the load/store
vectorizer (LSV) to 128 for the buffer and constant address spaces.

This avoids a potential SGPR pressure increase in shaders where multiple
resources are used. There is not enough context in LSV to determine whether
forming large vectors is beneficial for performance, and currently there is no
late phase in the compiler that would split vectors if register pressure were
too high (it could be argued that one should be added).

The extra large loads/stores could still be formed late in the backend in
si-load-store-optimizer, which also has some logic to avoid unbounded register
pressure increases, with the exception of s_load_dwordx16, which is not formed
there. s_load_dwordx16 is a tricky instruction to get right anyway, because
it can cause massive register pressure and fragmentation.
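
For illustration only, the restriction described above can be sketched as a
per-address-space cap on vector width. This is not the code from the patch:
the AddrSpace enum, maxVecBitWidth, dwordsFor, and the Restrict flag are
hypothetical names, and the 512-bit "before" value is an assumption based on
the 16-dword width of s_load_dwordx16.

```cpp
// Hypothetical, simplified address-space tags for illustration only; these
// are not the enumerators used by the real AMDGPU backend.
enum class AddrSpace { Constant, Buffer, Other };

// Assumed caps, in bits. 512 bits matches a 16-dword load such as
// s_load_dwordx16; 128 bits matches a 4-dword load.
constexpr unsigned ExtraLargeBits = 512;
constexpr unsigned RestrictedBits = 128;

// Sketch of the widest vector the load/store vectorizer may form for a given
// address space. With Restrict set, buffer and constant accesses are capped
// at 128 bits, as the description proposes.
constexpr unsigned maxVecBitWidth(AddrSpace AS, bool Restrict) {
  switch (AS) {
  case AddrSpace::Constant:
  case AddrSpace::Buffer:
    return Restrict ? RestrictedBits : ExtraLargeBits;
  case AddrSpace::Other:
    // Assumption: every other address space already uses the smaller cap.
    return RestrictedBits;
  }
  return RestrictedBits;
}

// Number of 32-bit registers needed to hold one loaded vector.
constexpr unsigned dwordsFor(unsigned Bits) { return Bits / 32; }

// A 512-bit load ties up 16 consecutive 32-bit registers; a 128-bit load
// ties up 4.
static_assert(dwordsFor(maxVecBitWidth(AddrSpace::Constant, false)) == 16, "");
static_assert(dwordsFor(maxVecBitWidth(AddrSpace::Constant, true)) == 4, "");
```

Under these assumptions, a shader that loads from several resources would
need 4 registers per vectorized load instead of 16, which is the pressure
reduction the description is after.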

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D151703

Files:
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
  llvm/test/CodeGen/AMDGPU/add.v2i16.ll
  llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
  llvm/test/CodeGen/AMDGPU/dagcomb-extract-vec-elt-different-sizes.ll
  llvm/test/CodeGen/AMDGPU/fcopysign.f64.ll
  llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
  llvm/test/CodeGen/AMDGPU/frem.ll
  llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props-v3.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props.ll
  llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
  llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
  llvm/test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
  llvm/test/CodeGen/AMDGPU/llvm.minnum.f16.ll
  llvm/test/CodeGen/AMDGPU/load-constant-f64.ll
  llvm/test/CodeGen/AMDGPU/mul.ll
  llvm/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
  llvm/test/CodeGen/AMDGPU/sdiv64.ll
  llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll
  llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
  llvm/test/CodeGen/AMDGPU/shift-i128.ll
  llvm/test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll
  llvm/test/CodeGen/AMDGPU/srem64.ll
  llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
  llvm/test/CodeGen/AMDGPU/trunc-combine.ll
  llvm/test/CodeGen/AMDGPU/uaddo.ll
  llvm/test/CodeGen/AMDGPU/udiv64.ll
  llvm/test/CodeGen/AMDGPU/urem64.ll
  llvm/test/CodeGen/AMDGPU/usubo.ll
  llvm/test/CodeGen/AMDGPU/wave32.ll
  llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D151703.526583.patch
Type: text/x-patch
Size: 335580 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230530/b4fe4ef5/attachment-0001.bin>