[PATCH] D134463: [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 28 08:13:58 PDT 2022
arsenm added a comment.
In D134463#3820197 <https://reviews.llvm.org/D134463#3820197>, @foad wrote:
>> In the case where we want V[1].hi : V[0].low we can't lower to `V_ALIGNBIT_B32 $V0, $V1, 16` because that would incorrectly put the bits from $V0 as the MSBs in the dest. On the other hand `V_ALIGNBIT_B32 $V1, $V0, 16` correctly has the bits from $V1 as the MSBs, but they are the lower 16 (and the higher 16 from $V0).
>
> Good point. You could use V_BFI_B32 but I guess that is no better or worse than V_PERM_B32.
One small point in favor of BFI is that the bitmask it needs is more likely to be CSEable with unrelated uses.
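
For readers following along, here is a rough Python model of the three instructions discussed above, applied to the "V1.hi : V0.lo" case. It is only a sketch. The helper names are made up for illustration, the operand order reflects my reading of the ISA documentation (S0 forms the high half of the 64-bit source for both V_ALIGNBIT_B32 and V_PERM_B32), and the V_PERM_B32 model handles only plain byte selectors 0 to 7, not the special selector values. The selector 0x07060100 is an assumed value that follows from that model, not something taken from the patch.

```python
MASK32 = 0xFFFFFFFF

def v_alignbit_b32(s0, s1, s2):
    # Assumed semantics: ({s0, s1} >> s2[4:0]) truncated to 32 bits,
    # with s0 as the high 32 bits of the 64-bit source.
    return (((s0 << 32) | s1) >> (s2 & 31)) & MASK32

def v_bfi_b32(s0, s1, s2):
    # Bitfield insert: take bits from s1 where mask s0 is set, else from s2.
    return ((s0 & s1) | (~s0 & s2)) & MASK32

def v_perm_b32(s0, s1, sel):
    # Simplified byte permute over {s0, s1}: bytes 0-3 come from s1,
    # bytes 4-7 from s0. Only selectors 0..7 are modeled here.
    src = ((s0 << 32) | s1).to_bytes(8, "little")
    out = 0
    for i in range(4):
        idx = (sel >> (8 * i)) & 0xFF
        if idx > 7:
            raise ValueError("special selector values are not modeled")
        out |= src[idx] << (8 * i)
    return out

v0 = 0xAAAABBBB
v1 = 0xCCCCDDDD
want = (v1 & 0xFFFF0000) | (v0 & 0x0000FFFF)  # V1.hi : V0.lo

# Neither operand order of alignbit with shift 16 gives the wanted value.
alignbit_a = v_alignbit_b32(v0, v1, 16)  # V0.lo : V1.hi
alignbit_b = v_alignbit_b32(v1, v0, 16)  # V1.lo : V0.hi

bfi_result = v_bfi_b32(0xFFFF0000, v1, v0)
perm_result = v_perm_b32(v1, v0, 0x07060100)
```

Under this model both BFI and PERM need one extra constant operand. The BFI mask 0xFFFF0000 is a generic half-word mask, while the PERM selector is specific to this particular shuffle, which is the CSE point made above.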
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134463/new/
https://reviews.llvm.org/D134463