[PATCH] D74568: AMDGPU/GlobalISel: Handle G_BSWAP
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 14 08:41:24 PST 2020
arsenm marked an inline comment as done.
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/bswap.ll:18
+; GFX8: ; %bb.0:
+; GFX8-NEXT: v_mov_b32_e32 v0, s0
+; GFX8-NEXT: s_mov_b32 s0, 0x10203
----------------
foad wrote:
> Just curious: why is this v_mov needed? Can't v_perm read this value directly from s0?
Reading the value directly from s0 would violate the constant bus restriction, because the mask is already in an SGPR. This could be folded on gfx10, where the limit is 2. However, it is only a problem because the constant is in an SGPR in the first place. If we materialized the mask in a VGPR, we could fold it. We don't try to optimize this case yet.
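
To make the restriction concrete, here is a minimal Python sketch of the check. It is an illustrative model only, not LLVM code: the helper names and the operand lists are hypothetical, and it assumes a limit of one constant-bus read per VALU instruction before gfx10 and two on gfx10, counting each distinct SGPR once.

```python
# Hypothetical model of the constant bus restriction; not the LLVM implementation.
# Operands are plain strings: "vN" is a VGPR, "sN" is an SGPR.

def constant_bus_uses(operands):
    """Count the distinct SGPRs read; the same SGPR read twice counts once."""
    return len({op for op in operands if op.startswith("s")})

def fits_constant_bus(operands, limit):
    """Return True if the instruction stays within the constant bus limit."""
    return constant_bus_uses(operands) <= limit

# v_perm reading both the value and the mask from SGPRs (register names illustrative).
both_sgpr = ["s0", "s1"]
# The same operands after copying the value into a VGPR with v_mov first.
value_in_vgpr = ["v0", "s0"]

GFX8_LIMIT = 1   # assumed pre-gfx10 limit
GFX10_LIMIT = 2  # limit stated in the reply above

print(fits_constant_bus(both_sgpr, GFX8_LIMIT))
print(fits_constant_bus(both_sgpr, GFX10_LIMIT))
print(fits_constant_bus(value_in_vgpr, GFX8_LIMIT))
```

Under these assumptions, the two-SGPR form is rejected with a limit of 1 and accepted with a limit of 2, which is why the copy is needed on GFX8 and could be folded on gfx10.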
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74568/new/
https://reviews.llvm.org/D74568