[PATCH] D40183: [AMDGPU] Waitcnt pass. Add S_WAITCNT 0 if incomplete predecessor info

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 17 11:01:36 PST 2017


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1322
+    if (!Visited && !ScoreBrackets->getRevisitLoop()){
+      // pred not visited, so there better be no PredScoreBracket.
+      // Unfortunately, emit a s_waitcnt 0 since we don't have
----------------
s/pred/Pred


================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:2
+; RUN: llc -mtriple=amdgcn -verify-machineinstrs < %s | FileCheck %s
+
+; check that the waitcnt pass inserts a S_WAITCNT 0 at the top of a
----------------
A smaller MIR test is probably possible.


================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:6
+
+; CHECK-LABEL: BB0_3:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
CHECK-LABEL is usually only used for function names


================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:8
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-LABEL: BB0_4:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
Use a regex, at least for the first number in the block label.
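
Something along these lines might work. This is only a sketch: the `{{^}}` line anchor and the switch from CHECK-LABEL to plain CHECK for the block labels are assumptions, not taken from the patch.

```llvm
; Match the block label without hard-coding the first number.
; CHECK: {{^}}BB{{[0-9]+}}_3:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; CHECK: {{^}}BB{{[0-9]+}}_4:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
```

With this, the checks do not depend on the first number in the label staying 0.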


================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:14
+  %21 = insertelement <2 x i32> <i32 undef, i32 1>, i32 %2, i32 0
+  %22 = bitcast <2 x i32> %21 to i64
+  %23 = inttoptr i64 %22 to [4294967295 x i8] addrspace(2)*
----------------
Run instnamer (opt -instnamer) on the test so the values have names instead of numbers.


================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:90-98
+!opencl.kernels = !{}
+!spirv.EntryPoints = !{!0}
+!opencl.enable.FP_CONTRACT = !{}
+!spirv.Source = !{!2}
+!opencl.spir.version = !{!3}
+!opencl.ocl.version = !{!3}
+!opencl.used.extensions = !{!4}
----------------
Remove the metadata.


https://reviews.llvm.org/D40183




