[PATCH] D40183: [AMDGPU] Waitcnt pass. Add S_WAITCNT 0 if incomplete predecessor info
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 17 11:01:36 PST 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1322
+ if (!Visited && !ScoreBrackets->getRevisitLoop()){
+ // pred not visited, so there better be no PredScoreBracket.
+ // Unfortunately, emit a s_waitcnt 0 since we don't have
----------------
s/pred/Pred
================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:2
+; RUN: llc -mtriple=amdgcn -verify-machineinstrs < %s | FileCheck %s
+
+; check that the waitcnt pass inserts a S_WAITCNT 0 at the top of a
----------------
A smaller MIR test is probably possible.
================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:6
+
+; CHECK-LABEL: BB0_3:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
CHECK-LABEL is usually only used for function names; a plain CHECK should work for the basic block labels.
================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:8
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-LABEL: BB0_4:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
Use a regex for at least the first number in the block label (the function number in BB0_4).
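For illustration, a minimal sketch of how the quoted checks might look with a regex for the function number and a plain CHECK instead of CHECK-LABEL. The block numbers 3 and 4 are copied from the quoted test; whether they stay stable across codegen changes is an assumption.

```llvm
; The function number is matched with a FileCheck regex; only the
; block number (taken from the quoted test) is hard-coded.
; CHECK: BB{{[0-9]+}}_3:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; CHECK: BB{{[0-9]+}}_4:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
```

If the block numbers also turn out to be fragile, the second number could be matched with the same `{{[0-9]+}}` pattern, at the cost of a less specific check.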
================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:14
+ %21 = insertelement <2 x i32> <i32 undef, i32 1>, i32 %2, i32 0
+ %22 = bitcast <2 x i32> %21 to i64
+ %23 = inttoptr i64 %22 to [4294967295 x i8] addrspace(2)*
----------------
Run the test through instnamer (opt -instnamer) so the values get names instead of numbers.
================
Comment at: test/CodeGen/AMDGPU/waitcnt-no-preds.ll:90-98
+!opencl.kernels = !{}
+!spirv.EntryPoints = !{!0}
+!opencl.enable.FP_CONTRACT = !{}
+!spirv.Source = !{!2}
+!opencl.spir.version = !{!3}
+!opencl.ocl.version = !{!3}
+!opencl.used.extensions = !{!4}
----------------
Remove the metadata.
https://reviews.llvm.org/D40183