[PATCH] D74629: AMDGPU/GlobalISel: Improve 16-bit bswap
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 17 07:46:59 PST 2020
arsenm marked an inline comment as done.
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/bswap.ll:348
+; GFX7-NEXT: s_or_b32 s0, s0, s1
+; GFX7-NEXT: s_bfe_u32 s0, s0, 0x100000
; GFX7-NEXT: ; return to shader part epilog
----------------
foad wrote:
> Why do we get masking both before and after the operation (s_and and s_bfe)? It seems like only one or the other should be required, depending on whether the upper bits of the register are undefined or defined to be zero.
We inserted a zext to satisfy the readfirstlane type constraint. We don't really have any optimizations that would take care of this yet, and currently the readfirstlane would still be in the way even if such an optimization ran.
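
For readers following the thread, here is a rough Python model of the masking being discussed. It is an illustrative sketch only, not the code the backend emits: the shift/and sequence in `bswap16_in_32` is an assumption (only the `s_or_b32` and `s_bfe_u32` lines appear in the quoted test), and the helper names and the `premask` flag are hypothetical. The one firm point is that an `s_bfe_u32` operand of `0x100000` encodes offset 0 and width 16, i.e. a zero-extension of the low 16 bits.

```python
MASK32 = 0xFFFFFFFF


def s_bfe_u32(src0: int, src1: int) -> int:
    """Model of an unsigned bitfield extract on a 32-bit scalar register.

    src1 packs the field description: offset in bits [4:0],
    width in bits [22:16].
    """
    offset = src1 & 0x1F
    width = (src1 >> 16) & 0x7F
    if width == 0:
        return 0
    return ((src0 & MASK32) >> offset) & ((1 << width) - 1)


def bswap16_in_32(x: int, premask: bool = True) -> int:
    """Hypothetical 16-bit byte swap carried out in a 32-bit register.

    Assumed sequence: shift left for the new high byte, optionally mask
    the input (the "before" mask), shift right for the new low byte,
    OR the halves, then zero-extend with s_bfe_u32 (the "after" mask).
    """
    x &= MASK32
    shifted_up = (x << 8) & MASK32
    src = (x & 0xFFFF) if premask else x
    shifted_down = src >> 8
    return s_bfe_u32(shifted_up | shifted_down, 0x100000)


if __name__ == "__main__":
    clean = 0x00001234   # upper 16 bits known to be zero
    dirty = 0xABCD1234   # upper 16 bits hold unrelated data
    for value in (clean, dirty):
        for premask in (True, False):
            result = bswap16_in_32(value, premask)
            print(f"x={value:#010x} premask={premask} -> {result:#06x}")
```

In this model the pre-mask only changes the result when the upper 16 bits of the input are not already zero, which is the distinction foad's question turns on.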
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74629/new/
https://reviews.llvm.org/D74629