[llvm] [AArch64] Generate rev16 for certain uses of __builtin_bswap16 (PR #105375)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 3 14:09:21 PDT 2024
adprasad-nvidia wrote:
@DTeachs Thanks for pointing this out. As you noted, `rev16` should only be generated from `__builtin_bswap16` when the result is used as an i16. If the result is used as an i32, the top bits are still needed, and `rev16` byte-swaps the upper halfword as well rather than clearing it.
The updated patch now checks for this and only lowers to `rev16` when the result is used as an i16. It does so by checking that the `bswap` is followed by an `any_extend`, which means the top half is not used, i.e. the result is used as an i16. If it were instead followed by a `zero_extend`, the top half would be used, and in that case the updated patch does not lower to `rev16`.
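
To make the difference concrete, here is a small C sketch of the two operations. It is only a software model, not code from the patch: `rev16_w` and `bswap16` are hypothetical helper names, with `rev16_w` imitating `rev16` on a 32-bit W register and `bswap16` standing in for `__builtin_bswap16`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `rev16 Wd, Wn`: swaps the two bytes inside
   each 16-bit halfword of a 32-bit value. */
static uint32_t rev16_w(uint32_t x) {
    return ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8);
}

/* Portable 16-bit byte swap, standing in for __builtin_bswap16. */
static uint16_t bswap16(uint16_t x) {
    return (uint16_t)((x << 8) | (x >> 8));
}
```

With the input `0xAABBCCDD`, `rev16_w` returns `0xBBAADDCC`. Its low 16 bits, `0xDDCC`, match `bswap16(0xCCDD)`, so the result is fine when only an i16 is consumed. The zero-extended i32 result would have to be `0x0000DDCC`, which `rev16` alone does not produce because the upper halfword still holds `0xBBAA`.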
I have added a test in CodeGen/AArch64/bswap.ll that checks `bswap` is not lowered to `rev16` when the result is used as an i32.
https://github.com/llvm/llvm-project/pull/105375