[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 11 03:03:47 PDT 2020
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:428
+ (i64 (int_amdgcn_ballot i1:$src)),
+ (S_AND_B64 (i64 (COPY_TO_REGCLASS $src, SReg_64)), (i64 EXEC))
+>;
----------------
If $src is the result of a v_cmp instruction then this s_and will be redundant, won't it? Because v_cmp is defined to return a zero in vcc for all inactive lanes.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:3889
+ Function *NewF = Intrinsic::getDeclaration(
+ II->getModule(), Intrinsic::read_register, II->getType());
+ Metadata *MDArgs[] = {MDString::get(II->getContext(), "exec")};
----------------
Are you sure it's safe to use read_register(exec) for this? I'm not an expert but I have been told it's not safe, because the compiler doesn't understand that it can't reorder the read_register call past other instructions that modify exec.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65088/new/
https://reviews.llvm.org/D65088
More information about the llvm-commits
mailing list