[PATCH] D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass

Wed Apr 5 09:33:14 PDT 2017

arsenm added a comment.

Fixing crashes on it is good, but why are you spending so much effort on optimizing non-canonical IR? InstCombine decomposes aggregate loads and stores into loads and stores of the individual components. Similarly, we give up on array allocas written in the form that allocates N items of the element type.
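
To illustrate the canonical-form point, here is a hand-written sketch (not actual opt output) of an aggregate copy next to the per-element form that InstCombine is expected to produce, plus the two array alloca spellings. All function and value names are made up for this example, and the syntax assumes the typed pointers of this LLVM era.

```llvm
; Non-canonical: the whole aggregate is loaded and stored at once.
define void @copy_aggr([2 x float]* %dst, [2 x float]* %src) {
  %v = load [2 x float], [2 x float]* %src
  store [2 x float] %v, [2 x float]* %dst
  ret void
}

; Roughly the decomposed form: one load and one store per element.
define void @copy_elts([2 x float]* %dst, [2 x float]* %src) {
  %src0 = getelementptr inbounds [2 x float], [2 x float]* %src, i32 0, i32 0
  %v0 = load float, float* %src0
  %src1 = getelementptr inbounds [2 x float], [2 x float]* %src, i32 0, i32 1
  %v1 = load float, float* %src1
  %dst0 = getelementptr inbounds [2 x float], [2 x float]* %dst, i32 0, i32 0
  store float %v0, float* %dst0
  %dst1 = getelementptr inbounds [2 x float], [2 x float]* %dst, i32 0, i32 1
  store float %v1, float* %dst1
  ret void
}

; Two spellings of a four-float stack array.
define void @alloca_forms() {
  %a = alloca [4 x float]      ; array type as the allocated type
  %b = alloca float, i32 4     ; N items of the element type
  ret void
}
```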

================
Comment at: lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:483-486
+  // if (isa<GetElementPtrInst>(Ptr)) {
+  //   WorkList.insert(Inst);
+  //   return true;
+  // }
----------------
Commented-out code

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:1
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -amdgpu-promote-alloca < %s | FileCheck %s
+
----------------
Can you change the check prefix to IR or OPT or something, in case we want to add a codegen run line as well?
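
A minimal sketch of what that could look like, using FileCheck's -check-prefix option. The OPT prefix is one of the names suggested above. The llc line and its GCN prefix are purely hypothetical, shown only to illustrate why separate prefixes help; nothing in the patch says this test is ready for a codegen run.

```llvm
; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -amdgpu-promote-alloca < %s | FileCheck -check-prefix=OPT %s
; Hypothetical later addition, checked under its own prefix:
; RUN: llc -mtriple=amdgcn-amd-amdhsa < %s | FileCheck -check-prefix=GCN %s

; OPT-LABEL: @promote_1d_aggr(
```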

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:5
+
+; CHECK-LABEL: @promote_1d_aggr
+
----------------
These should end the name with ( so the label cannot match other functions whose names start with the same prefix
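
FileCheck patterns match substrings of a line, so a label without the trailing ( can match a longer function name. A small sketch, shown with the patch's current CHECK prefix; the second function name is an assumption made up for this example.

```llvm
; CHECK-LABEL: @promote_1d_aggr(
define amdgpu_vs void @promote_1d_aggr() {
  ret void
}

; Hypothetical second function sharing the prefix. Without the "(" in the
; label above, the pattern "@promote_1d_aggr" would also match this define line.
; CHECK-LABEL: @promote_1d_aggr_v2(
define amdgpu_vs void @promote_1d_aggr_v2() {
  ret void
}
```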

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:10-11
+
+@block = external addrspace(7) global %Block
+@0 = external addrspace(6) global %gl_PerVertex
+
----------------
addrspace 6/7?

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:11
+@block = external addrspace(7) global %Block
+@0 = external addrspace(6) global %gl_PerVertex
+
----------------
Anonymous global

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:13
+
+; Function Attrs: nounwind
+define amdgpu_vs void @promote_1d_aggr() #0 {
----------------
Remove these comments

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:17
+  %f1 = alloca [1 x float]
+  %1 = getelementptr %Block, %Block addrspace(7)* @block, i32 0, i32 1
+  %2 = load i32, i32 addrspace(7)* %1
----------------
You should run opt -instnamer on these tests to avoid anonymous values
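
For illustration, a hand-written sketch of the same lines with named values in place of %1 and %2. The names here are chosen by hand and are not what -instnamer would generate, and the %Block layout is an assumption, since the patch excerpt does not show its definition.

```llvm
%Block = type { [1 x float], i32 }   ; assumed layout, for this sketch only
@block = external addrspace(7) global %Block

define amdgpu_vs void @promote_1d_aggr() {
entry:
  %f1 = alloca [1 x float]
  ; Named values instead of the anonymous %1 and %2.
  %idx.ptr = getelementptr %Block, %Block addrspace(7)* @block, i32 0, i32 1
  %idx = load i32, i32 addrspace(7)* %idx.ptr
  ret void
}
```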

================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:103
+; Function Attrs: nounwind
+define amdgpu_vs void @promote_matrix_aggr() #0 {
+  %f4 = alloca <4 x float>
----------------
Can this case be reduced?

https://reviews.llvm.org/D31710