[llvm] [AMDGPU] add missing checks in processBaseWithConstOffset (PR #102310)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 9 03:46:30 PDT 2024
================
@@ -447,3 +447,33 @@ body: |
%13:vreg_64 = REG_SEQUENCE %9, %subreg.sub0, %11, %subreg.sub1
%15:vgpr_32 = FLAT_ATOMIC_ADD_RTN %13:vreg_64, %0.sub0, 0, 0, implicit $exec, implicit $flat_scr
...
+
+---
----------------
arsenm wrote:
You can compact the register numbers with -run-pass=none, and the test doesn't need the control flow. It could probably be a bit smaller:
```
---
# GCN-LABEL: name: negative_offset_nullptr
# GCN: V_ADD_CO_U32_e64 -1, 0, 0
# GCN: V_ADDC_U32_e64 -1, %{{[0-9]+}}, %{{[0-9]+}}, 0
name: negative_offset_nullptr
tracksRegLiveness: true
body: |
  bb.0:
    %0:sreg_64 = S_MOV_B64 $src_private_base
    %1:sreg_32 = S_MOV_B32 0
    %2:sreg_64 = REG_SEQUENCE %1, %subreg.sub0, %0.sub1, %subreg.sub1
    %3:vgpr_32, %4:sreg_64_xexec = V_ADD_CO_U32_e64 -1, 0, 0, implicit $exec
    %5:vgpr_32 = COPY %2.sub1
    %6:vgpr_32, %7:sreg_64 = V_ADDC_U32_e64 -1, %5, %4, 0, implicit $exec
    %8:vreg_64 = REG_SEQUENCE %3, %subreg.sub0, %6, %subreg.sub1
    %9:vgpr_32 = FLAT_LOAD_UBYTE %8, 0, 0, implicit $exec, implicit $flat_scr
    S_ENDPGM 0, implicit %9
...
```
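For anyone unsure what "compacting" the register numbers means here: the virtual registers end up numbered densely from %0 instead of carrying gaps left over from a larger test. The Python below is only a rough sketch of that effect on the MIR text. It is not how llc does it, `compact_vregs` is a made-up helper name, and numbering by first textual appearance is an assumption for illustration that may not match the order llc actually picks. In practice, round-trip the file through llc with -run-pass=none as suggested above.

```python
import re

# Numbered virtual registers such as %13. Names like %subreg.sub0 and
# FileCheck patterns like %{{[0-9]+}} do not start with a digit, so they
# are left untouched.
VREG = re.compile(r"%(\d+)\b")


def compact_vregs(mir_text: str) -> str:
    """Renumber %N registers densely, in order of first textual appearance.

    Hypothetical helper for illustration only; llc's own renumbering may
    assign the numbers in a different order.
    """
    mapping: dict[str, int] = {}

    def renumber(match: re.Match) -> str:
        old = match.group(1)
        if old not in mapping:
            mapping[old] = len(mapping)  # next unused dense index
        return f"%{mapping[old]}"

    return VREG.sub(renumber, mir_text)


# Made-up sample input with sparse register numbers.
sample = (
    "%9:vgpr_32 = COPY $vgpr0\n"
    "%11:vgpr_32 = COPY $vgpr1\n"
    "%13:vreg_64 = REG_SEQUENCE %9, %subreg.sub0, %11, %subreg.sub1\n"
)
print(compact_vregs(sample))
```

The regex only touches registers written as `%` followed by digits, so subregister indices and the CHECK-line patterns survive unchanged.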
https://github.com/llvm/llvm-project/pull/102310