[PATCH] D75293: [AMDGPU] Enable runtime unroll for LDS
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 27 13:04:47 PST 2020
rampitec created this revision.
rampitec added reviewers: cfang, sameerds, arsenm, yaxunl, AlexVlx.
Herald added subscribers: kerbowa, zzheng, hiraditya, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.
arsenm accepted this revision.
This revision is now accepted and ready to land.
arsenm added a comment.
I’ve thought we might want to always do runtime unroll
rampitec added a comment.
In D75293#1896535 <https://reviews.llvm.org/D75293#1896535>, @arsenm wrote:
> I’ve thought we might want to always do runtime unroll
We might, but so far there is not enough justification.
We want to unroll loops that access LDS even when the trip count is
known only at runtime, so that LDS operations can be combined.
https://reviews.llvm.org/D75293
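
For readers less familiar with runtime unrolling, here is a rough source-level picture of the transformation the summary refers to. It is only a sketch: the function names, the use of std::vector to stand in for LDS and global memory, and the unroll factor of 2 are illustrative assumptions (the factor is chosen to mirror the new test, which expects two LDS loads in the main loop and one in the epilogue). The pass itself works on LLVM IR, not on C++.

```cpp
#include <cstddef>
#include <vector>

// Rolled form: one load and one store per iteration. The trip count n is
// only known at run time, so full unrolling is not possible.
void copy_rolled(const std::vector<int> &lds, std::vector<int> &out,
                 std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = lds[i];
}

// Hypothetical hand-written equivalent of runtime unrolling by 2 with an
// epilogue loop for the remainder.
void copy_unrolled_by_2(const std::vector<int> &lds, std::vector<int> &out,
                        std::size_t n) {
  std::size_t i = 0;
  const std::size_t main_trips = n - (n % 2); // largest even count <= n

  // Main loop: two neighbouring loads now sit in the same loop body, which
  // is what gives later passes a chance to combine them.
  for (; i < main_trips; i += 2) {
    const int a = lds[i];
    const int b = lds[i + 1];
    out[i] = a;
    out[i + 1] = b;
  }

  // Epilogue: handles the at most one leftover iteration.
  for (; i < n; ++i)
    out[i] = lds[i];
}
```

Both functions write the same values for any n that fits the buffers; the unrolled version only changes how the work is grouped.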
Files:
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
llvm/test/CodeGen/AMDGPU/unroll.ll
Index: llvm/test/CodeGen/AMDGPU/unroll.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/unroll.ll
+++ llvm/test/CodeGen/AMDGPU/unroll.ll
@@ -99,3 +99,37 @@
for.end: ; preds = %for.cond
ret void
}
+
+; Check that runtime unroll is enabled for local memory references
+
+; CHECK-LABEL: @local_memory_runtime
+; CHECK: loop.header:
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: br i1
+; CHECK: loop.header.epil
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: ret
+define amdgpu_kernel void @local_memory_runtime(i32 addrspace(1)* %out, i32 addrspace(3)* %lds, i32 %n) {
+entry:
+ br label %loop.header
+
+loop.header:
+ %counter = phi i32 [0, %entry], [%inc, %loop.inc]
+ br label %loop.body
+
+loop.body:
+ %ptr_lds = getelementptr i32, i32 addrspace(3)* %lds, i32 %counter
+ %val = load i32, i32 addrspace(3)* %ptr_lds
+ %ptr_out = getelementptr i32, i32 addrspace(1)* %out, i32 %counter
+ store i32 %val, i32 addrspace(1)* %ptr_out
+ br label %loop.inc
+
+loop.inc:
+ %inc = add i32 %counter, 1
+ %cond = icmp sge i32 %counter, %n
+ br i1 %cond, label %exit, label %loop.header
+
+exit:
+ ret void
+}
Index: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -69,6 +69,11 @@
cl::desc("Unroll threshold increment for AMDGPU for each if statement inside loop"),
cl::init(150), cl::Hidden);
+static cl::opt<bool> UnrollRuntimeLocal(
+ "amdgpu-unroll-runtime-local",
+ cl::desc("Allow runtime unroll for AMDGPU if local memory used in a loop"),
+ cl::init(true), cl::Hidden);
+
static cl::opt<bool> UseLegacyDA(
"amdgpu-use-legacy-divergence-analysis",
cl::desc("Enable legacy divergence analysis for AMDGPU"),
@@ -177,6 +182,9 @@
(!isa<GlobalVariable>(GEP->getPointerOperand()) &&
!isa<Argument>(GEP->getPointerOperand())))
continue;
+ LLVM_DEBUG(dbgs() << "Allow unroll runtime for loop:\n"
+ << *L << " due to LDS use.\n");
+ UP.Runtime = UnrollRuntimeLocal;
}
// Check if GEP depends on a value defined by this loop itself.