[PATCH] D121371: [AMDGPUPromoteAlloca] Make compatible with opaque pointers

Thu Mar 10 07:23:52 PST 2022

nikic added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:433
+      // This is a store of the pointer, not to the pointer.
+      if (U->get() != Ptr)
         return false;
----------------
arsenm wrote:
> Doesn't this need to check the exact operand number in case a pointer is being stored to itself?
Yeah, at least far as intent goes. I don't think it's really relevant for correctness, but possibly for profitability.

================
Comment at: llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:40
 ;
   %i = alloca i32
   %f1 = alloca [1 x float]
----------------
arsenm wrote:
> Weird this test isn't using the right address space
Yeah, though I don't think it really matters for the transform. Should I adjust the test to use addrspace(5) for the allocas?

================
Comment at: llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:75-78
+; CHECK-NEXT:    [[TMP1:%.*]] = bitcast [2 x float]* [[F1]] to <2 x float>*
+; CHECK-NEXT:    [[TMP2:%.*]] = load <2 x float>, <2 x float>* [[TMP1]], align 8
+; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <2 x float> [[TMP2]], float [[FOO3]], i32 0
+; CHECK-NEXT:    store <2 x float> [[TMP3]], <2 x float>* [[TMP1]], align 8
----------------
arsenm wrote:
> I'm not sure what's going on here, but it may be an artifact of the test using the wrong address space for the stack
This is an intended change: `bitcast` should get the same treatment as `gep p, 0, 0`. This transform generally relies on SROA and InstCombine to clean up the mess afterwards.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121371/new/

https://reviews.llvm.org/D121371