[PATCH] D74629: AMDGPU/GlobalISel: Improve 16-bit bswap

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 17 07:46:59 PST 2020


arsenm marked an inline comment as done.
arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/bswap.ll:348
+; GFX7-NEXT:    s_or_b32 s0, s0, s1
+; GFX7-NEXT:    s_bfe_u32 s0, s0, 0x100000
 ; GFX7-NEXT:    ; return to shader part epilog
----------------
foad wrote:
> Why do we get masking both before and after the operation (s_and and s_bfe)? It seems like only one or the other should be required, depending on whether the upper bits of the register are undefined or defined to be zero.
We inserted a zext to satisfy the readfirstlane type constraint. We don't really have any optimizations that would take care of this yet, and even if we did, the readfirstlane would currently still be in the way.
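
To illustrate the redundancy being discussed, here is a rough Python model of a 16-bit bswap held in a 32-bit register. It assumes the swap is expanded as a shift-left/shift-right/or sequence, which is inferred from the s_or_b32 in the check lines rather than taken from the full test output. The function name and flags are hypothetical. The only instruction-level fact relied on is that s_bfe_u32 with operand 0x100000 extracts 16 bits starting at bit 0.

```python
REG_MASK = 0xFFFFFFFF  # model a 32-bit scalar register
MASK16 = 0xFFFF


def bswap16(x, mask_before=True, mask_after=True):
    """Byte-swap the low 16 bits of a 32-bit register value.

    mask_before models the s_and on the input (clears bits 31:16).
    mask_after models s_bfe_u32 ..., 0x100000 (offset 0, width 16),
    which behaves like a zext of the 16-bit result.
    """
    x &= REG_MASK
    if mask_before:
        x &= MASK16
    # Assumed expansion: (x << 8) | (x >> 8), truncated to 32 bits.
    r = ((x << 8) | (x >> 8)) & REG_MASK
    if mask_after:
        r &= MASK16
    return r


# Upper bits known zero on input: the mask before is redundant.
print(hex(bswap16(0x1234, mask_before=False)))
# Upper bits undefined on input and output: the mask after is redundant,
# since only the low 16 bits of the result are meaningful.
print(hex(bswap16(0xABCD1234, mask_after=False) & MASK16))
```

In this model, dropping the mask before is only safe when the input's upper bits are already zero; with garbage in bits 31:16, the logical right shift moves that garbage into the low half of the result.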


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74629/new/

https://reviews.llvm.org/D74629




