[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 11 03:03:47 PDT 2020


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:428
+  (i64 (int_amdgcn_ballot i1:$src)),
+  (S_AND_B64 (i64 (COPY_TO_REGCLASS $src, SReg_64)), (i64 EXEC))
+>;
----------------
If $src is the result of a v_cmp instruction then this s_and will be redundant, won't it? Because v_cmp is defined to return a zero in vcc for all inactive lanes.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:3889
+        Function *NewF = Intrinsic::getDeclaration(
+            II->getModule(), Intrinsic::read_register, II->getType());
+        Metadata *MDArgs[] = {MDString::get(II->getContext(), "exec")};
----------------
Are you sure it's safe to use read_register(exec) for this? I'm not an expert but I have been told it's not safe, because the compiler doesn't understand that it can't reorder the read_register call past other instructions that modify exec.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65088/new/

https://reviews.llvm.org/D65088





More information about the llvm-commits mailing list