[PATCH] D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 5 09:33:14 PDT 2017
arsenm added a comment.
Fixing crashes on it is good, but why are you spending so much effort on optimizing non-canonical IR? InstCombine decomposes aggregate loads and stores into loads and stores of the individual components. Similarly, we give up on array allocas written in the form that allocates N items of the element type.
================
Comment at: lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:483-486
+ // if (isa<GetElementPtrInst>(Ptr)) {
+ // WorkList.insert(Inst);
+ // return true;
+ // }
----------------
Commented-out code
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:1
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -amdgpu-promote-alloca < %s | FileCheck %s
+
----------------
Can you change the check prefix to IR or OPT or something, in case we want to add a codegen run line as well?
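A minimal sketch of what that could look like, assuming OPT is the prefix chosen (IR would work the same way). Only the FileCheck invocation changes; the rest of the RUN line is copied from the patch. The check lines in the file would then use OPT: and OPT-LABEL: in place of CHECK: and CHECK-LABEL:.

```llvm
; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -amdgpu-promote-alloca < %s | FileCheck -check-prefix=OPT %s
```

A later codegen RUN line could then pick its own prefix without its checks being confused with the opt-only ones.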
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:5
+
+; CHECK-LABEL: @promote_1d_aggr
+
----------------
These should end the name with ( to avoid potentially matching cases with the same prefix.
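For illustration, the label from the quoted line with the parenthesis added. The second function name mentioned in the comment is hypothetical and only shows the kind of accidental match being avoided.

```llvm
; Without the trailing "(", the label pattern could also match a longer
; name that starts the same way, e.g. a hypothetical @promote_1d_aggr_v2.
; CHECK-LABEL: @promote_1d_aggr(
```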
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:10-11
+
+@block = external addrspace(7) global %Block
+@0 = external addrspace(6) global %gl_PerVertex
+
----------------
addrspace 6/7?
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:11
+@block = external addrspace(7) global %Block
+@0 = external addrspace(6) global %gl_PerVertex
+
----------------
Anonymous global
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:13
+
+; Function Attrs: nounwind
+define amdgpu_vs void @promote_1d_aggr() #0 {
----------------
Remove these comments
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:17
+ %f1 = alloca [1 x float]
+ %1 = getelementptr %Block, %Block addrspace(7)* @block, i32 0, i32 1
+ %2 = load i32, i32 addrspace(7)* %1
----------------
You should run opt -instnamer on these tests to avoid anonymous values
================
Comment at: test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll:103
+; Function Attrs: nounwind
+define amdgpu_vs void @promote_matrix_aggr() #0 {
+ %f4 = alloca <4 x float>
----------------
Can this case be reduced?
https://reviews.llvm.org/D31710