[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 12 02:20:04 PDT 2020
Flakebi marked an inline comment as done.
Flakebi added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:3889
+ Function *NewF = Intrinsic::getDeclaration(
+ II->getModule(), Intrinsic::read_register, II->getType());
+ Metadata *MDArgs[] = {MDString::get(II->getContext(), "exec")};
----------------
Flakebi wrote:
> foad wrote:
> > Are you sure it's safe to use read_register(exec) for this? I'm not an expert but I have been told it's not safe, because the compiler doesn't understand that it can't reorder the read_register call past other instructions that modify exec.
> Is there a safe way to get the exec register? I’m fine with removing this optimization here and doing it only in the SDag.
Actually, icmp and fcmp already create `read_register(exec)` in InstCombine. So, having this combination and replacing icmp with ballot won’t change behavior.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65088/new/
https://reviews.llvm.org/D65088
More information about the llvm-commits
mailing list