[PATCH] D75293: [AMDGPU] Enable runtime unroll for LDS

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 27 13:04:47 PST 2020


rampitec created this revision.
rampitec added reviewers: cfang, sameerds, arsenm, yaxunl, AlexVlx.
Herald added subscribers: kerbowa, zzheng, hiraditya, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.
arsenm accepted this revision.
This revision is now accepted and ready to land.
arsenm added a comment.

I’ve thought might want to always do runtime unroll


rampitec added a comment.

In D75293#1896535 <https://reviews.llvm.org/D75293#1896535>, @arsenm wrote:

> I’ve thought might want to always do runtime unroll


We might, but so far there is not enough justification.


We want to do unroll for LDS even for runtime trip count
to combine LDS operations


https://reviews.llvm.org/D75293

Files:
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/test/CodeGen/AMDGPU/unroll.ll


Index: llvm/test/CodeGen/AMDGPU/unroll.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/unroll.ll
+++ llvm/test/CodeGen/AMDGPU/unroll.ll
@@ -99,3 +99,37 @@
 for.end:                                          ; preds = %for.cond
   ret void
 }
+
+; Check that runtime unroll is enabled for local memory references
+
+; CHECK-LABEL: @local_memory_runtime
+; CHECK: loop.header:
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: br i1
+; CHECK: loop.header.epil
+; CHECK: load i32, i32 addrspace(3)*
+; CHECK: ret
+define amdgpu_kernel void @local_memory_runtime(i32 addrspace(1)* %out, i32 addrspace(3)* %lds, i32 %n) {
+entry:
+  br label %loop.header
+
+loop.header:
+  %counter = phi i32 [0, %entry], [%inc, %loop.inc]
+  br label %loop.body
+
+loop.body:
+  %ptr_lds = getelementptr i32, i32 addrspace(3)* %lds, i32 %counter
+  %val = load i32, i32 addrspace(3)* %ptr_lds
+  %ptr_out = getelementptr i32, i32 addrspace(1)* %out, i32 %counter
+  store i32 %val, i32 addrspace(1)* %ptr_out
+  br label %loop.inc
+
+loop.inc:
+  %inc = add i32 %counter, 1
+  %cond = icmp sge i32 %counter, %n
+  br i1 %cond, label  %exit, label %loop.header
+
+exit:
+  ret void
+}
Index: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -69,6 +69,11 @@
   cl::desc("Unroll threshold increment for AMDGPU for each if statement inside loop"),
   cl::init(150), cl::Hidden);
 
+static cl::opt<bool> UnrollRuntimeLocal(
+  "amdgpu-unroll-runtime-local",
+  cl::desc("Allow runtime unroll for AMDGPU if local memory used in a loop"),
+  cl::init(true), cl::Hidden);
+
 static cl::opt<bool> UseLegacyDA(
   "amdgpu-use-legacy-divergence-analysis",
   cl::desc("Enable legacy divergence analysis for AMDGPU"),
@@ -177,6 +182,9 @@
             (!isa<GlobalVariable>(GEP->getPointerOperand()) &&
              !isa<Argument>(GEP->getPointerOperand())))
           continue;
+        LLVM_DEBUG(dbgs() << "Allow unroll runtime for loop:\n"
+                          << *L << " due to LDS use.\n");
+        UP.Runtime = UnrollRuntimeLocal;
       }
 
       // Check if GEP depends on a value defined by this loop itself.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D75293.247072.patch
Type: text/x-patch
Size: 2372 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200227/af90392b/attachment-0001.bin>


More information about the llvm-commits mailing list