[PATCH] D77315: AMDGPU: Hack out noinline on functions using LDS globals
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 2 09:12:23 PDT 2020
arsenm created this revision.
arsenm added reviewers: rampitec, yaxunl.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl.
This is a workaround for clang adding noinline to all functions at
-O0. Previously, we would just add alwaysinline to non-entry functions
that use LDS globals, and the verifier would complain about having both
noinline and alwaysinline. We currently can't truly codegen a function
that uses an LDS global as a freestanding function, so override the
user-forced noinline.
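
The attribute conflict described above can be illustrated with a small
standalone model. This is only a sketch and does not use the LLVM API:
ToyFunction, Attr, verify, forceInlineOld and forceInlineNew are
hypothetical names invented for illustration, and the real pass applies
the attributes in separate steps rather than in one helper.

```cpp
#include <set>
#include <string>

// Hypothetical stand-ins for LLVM function attributes; not the real API.
enum class Attr { NoInline, AlwaysInline };

// Toy model of a function carrying a set of attributes.
struct ToyFunction {
  std::string Name;
  std::set<Attr> Attrs;
};

// Models the verifier rule mentioned in the description: a function must
// not carry both noinline and alwaysinline at the same time.
bool verify(const ToyFunction &F) {
  return !(F.Attrs.count(Attr::NoInline) && F.Attrs.count(Attr::AlwaysInline));
}

// Models the previous behavior: add alwaysinline, leave noinline alone.
void forceInlineOld(ToyFunction &F) { F.Attrs.insert(Attr::AlwaysInline); }

// Models the patched behavior: drop noinline first, then add alwaysinline.
void forceInlineNew(ToyFunction &F) {
  F.Attrs.erase(Attr::NoInline);
  F.Attrs.insert(Attr::AlwaysInline);
}
```

In this model, a function that starts with only NoInline fails verify
after forceInlineOld and passes it after forceInlineNew, which is the
difference the patch is meant to make.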
https://reviews.llvm.org/D77315
Files:
llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
Index: llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
+++ llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
@@ -74,4 +74,21 @@
ret i32 %call
}
+; Test we don't break the IR and have both alwaysinline and noinline
+; FIXME: We should really not override noinline.
+
+; ALL-LABEL: define i32 @load_lds_simple_noinline() #0 {
+define i32 @load_lds_simple_noinline() noinline {
+ %load = load i32, i32 addrspace(3)* @lds0, align 4
+ ret i32 %load
+}
+
+; ALL-LABEL: define i32 @recursive_call_lds_noinline(i32 %arg0) #0 {
+define i32 @recursive_call_lds_noinline(i32 %arg0) noinline {
+ %load = load i32, i32 addrspace(3)* @lds0, align 4
+ %add = add i32 %arg0, %load
+ %call = call i32 @recursive_call_lds(i32 %add)
+ ret i32 %call
+}
+
; ALL: attributes #0 = { alwaysinline }
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -71,6 +71,13 @@
if (Instruction *I = dyn_cast<Instruction>(U)) {
Function *F = I->getParent()->getParent();
if (!AMDGPU::isEntryFunctionCC(F->getCallingConv())) {
+ // FIXME: This is a horrible hack. We should always respect noinline,
+ // and just let us hit the error when we can't handle this.
+ //
+ // Unfortunately, clang adds noinline to all functions at -O0. We have
+ // to override this here until that's fixed.
+ F->removeFnAttr(Attribute::NoInline);
+
FuncsToAlwaysInline.insert(F);
Stack.push_back(F);
}