[PATCH] D117118: [NVPTX] Fix shr/and pair replace with bfe

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 9 11:24:43 PST 2023


tra added a comment.

Does your patch make any difference for the first test case? AFAICT, it currently does not produce `bfe`: https://godbolt.org/z/94qG3E4xM

In the second example, we do produce `bfe.u64         %rd2, %rd1, 63, 3;`, which extracts a field whose upper bits lie outside the input. AFAICT, the PTX manual says (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#integer-arithmetic-instructions-bfe) that in cases like this the extracted field will replicate the MSB of the input value, which is indeed not what we want here.
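
To make the out-of-range case concrete, here is a minimal host-side sketch of 64-bit field extraction. The function `emulate_bfe64` and its `is_signed` flag are hypothetical names for illustration, not LLVM or CUDA APIs. The fill rule is an assumption based on my reading of the pseudocode in the PTX manual (zero fill for unsigned types, the input's MSB for signed types when the field extends past it), so it should be checked against the manual before relying on it.

```cpp
#include <cstdint>

// Hypothetical model of PTX `bfe` on 64-bit values (illustration only).
// Assumption: bits of the field that fall past the input's MSB are filled
// with 0 for unsigned types, and with the input's MSB for signed types.
uint64_t emulate_bfe64(uint64_t a, uint32_t pos, uint32_t len, bool is_signed) {
    const uint32_t msb = 63;
    pos &= 0xff;  // the manual restricts pos and len to 0..255
    len &= 0xff;

    // Pick the fill bit: 0 unless this is a signed extract of a non-empty field.
    uint64_t sbit = 0;
    if (is_signed && len != 0) {
        uint32_t top = pos + len - 1;
        if (top > msb) top = msb;  // field runs past the MSB: use the input's MSB
        sbit = (a >> top) & 1;
    }

    // Build the result one bit at a time.
    uint64_t d = 0;
    for (uint32_t i = 0; i <= msb; ++i) {
        uint64_t bit = (i < len && pos + i <= msb) ? ((a >> (pos + i)) & 1) : sbit;
        d |= bit << i;
    }
    return d;
}
```

For comparison, the shr/and pair being replaced computes `(a >> 63) & 7` with a logical shift. Calling `emulate_bfe64(a, 63, 3, is_signed)` with both values of `is_signed` and comparing against that expression shows when, under the assumed fill rule, the two forms agree.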


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117118/new/

https://reviews.llvm.org/D117118


