[PATCH] D117118: [NVPTX] Fix shr/and pair replace with bfe
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 9 11:24:43 PST 2023
tra added a comment.
Does your patch make any difference for the first test case? AFAICT, it currently does not produce `bfe`: https://godbolt.org/z/94qG3E4xM
In the second example, we do produce `bfe.u64 %rd2, %rd1, 63, 3;`, which extracts a field whose upper bits lie outside of the input. AFAICT, the PTX manual says (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#integer-arithmetic-instructions-bfe) that in cases like this, the extracted field will replicate the MSB of the input value, which is indeed not what we want here.
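To make the out-of-range condition concrete, here is a minimal C++ sketch. It does not model `bfe` itself and it is not code from the patch. `fieldFitsInWidth` and `shrAnd` are hypothetical helper names, and the shift amount 63 with mask 0x7 is an assumption inferred from the `63, 3` operands above. It only shows that the requested field overruns a 64-bit input and what a logical shift followed by a mask computes for that input.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true when the bit field
// [pos, pos + len) lies entirely inside a value that is `width` bits wide.
constexpr bool fieldFitsInWidth(unsigned pos, unsigned len, unsigned width) {
  return pos < width && len <= width - pos;
}

// Reference result of a logical right shift followed by a mask on a
// 64-bit unsigned input. Assumes shift < 64.
constexpr std::uint64_t shrAnd(std::uint64_t x, unsigned shift,
                               std::uint64_t mask) {
  return (x >> shift) & mask;
}

// Position 63 with length 3 asks for bits 63..65, past the end of the input.
static_assert(!fieldFitsInWidth(63, 3, 64), "field overruns a 64-bit input");
// Position 61 with length 3 asks for bits 61..63, which fits.
static_assert(fieldFitsInWidth(61, 3, 64), "field fits in a 64-bit input");
// With only bit 63 set, the logical shift and mask leave just that one bit.
static_assert(shrAnd(0x8000000000000000ULL, 63, 0x7) == 1,
              "only bit 63 survives");
```

A check along these lines could guard the replacement so that a shift/mask pair whose field does not fit is left alone, though whether the patch does it this way is not something this sketch claims.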
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117118/new/
https://reviews.llvm.org/D117118