[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 11 02:19:28 PDT 2020
foad added a comment.
In D65088#1916351 <https://reviews.llvm.org/D65088#1916351>, @Flakebi wrote:
> The code generation for test2 is currently not optimal:
>
>   %trunc = trunc i32 %x to i1
>   %ballot = call i64 @llvm.amdgcn.ballot.i64(i1 %trunc)
>
> generates
>
>   v_and_b32_e32 v0, 1, v0
>   v_cmp_eq_u32_e32 vcc, 1, v0
>   s_and_b64 s[4:5], vcc, exec
>
> where the first compare stems from the truncate.

I'm confused by this. What is the optimal code generation?
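
For reference, here is a rough model of what the quoted three-instruction sequence computes. This Python sketch is not part of the review: it assumes a 64-lane wave (matching the i64 ballot), represents the per-lane value of v0 as a list, and uses a hypothetical helper name, ballot_of_trunc. It is simplified in that it evaluates every lane and only then masks with exec.

```python
WAVE_SIZE = 64  # assumption: wave64, matching the i64 result type


def ballot_of_trunc(x_per_lane, exec_mask):
    """Simplified model of ballot(trunc i32 %x to i1) as lowered above.

    x_per_lane: list of WAVE_SIZE integers, one value of v0 per lane.
    exec_mask:  integer bitmask of active lanes (bit i = lane i).
    """
    assert len(x_per_lane) == WAVE_SIZE
    vcc = 0
    for lane, x in enumerate(x_per_lane):
        v0 = x & 1            # v_and_b32: keep bit 0, i.e. the trunc to i1
        if v0 == 1:           # v_cmp_eq_u32: record the lane's result in vcc
            vcc |= 1 << lane
    return vcc & exec_mask    # s_and_b64: keep only lanes that are active


# Usage: lane i holds the value i, and only lanes 0..7 are active.
mask = ballot_of_trunc(list(range(WAVE_SIZE)), 0xFF)
```

With these inputs the odd lanes among the first eight (1, 3, 5 and 7) contribute a bit, so mask is 0xAA.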

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65088/new/

https://reviews.llvm.org/D65088