[PATCH] D134463: [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 28 08:13:58 PDT 2022


arsenm added a comment.

In D134463#3820197 <https://reviews.llvm.org/D134463#3820197>, @foad wrote:

>> In the case where we want V[1].hi : V[0].low we can't lower to `V_ALIGNBIT_B32 $V0, $V1, 16` because that would incorrectly put the bits from $V0 as the MSBs in the dest. On the other hand `V_ALIGNBIT_B32 $V1, $V0, 16` correctly has the bits from $V1 as the MSBs, but they are the lower 16 (and the higher 16 from $V0).
>
> Good point. You could use V_BFI_B32 but I guess that is no better or worse than V_PERM_B32.

One small point in favor of BFI is that the bitmask it needs is more likely to be CSEable with unrelated uses.
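
For reference, a minimal sketch of the bit arithmetic under discussion. It assumes V_ALIGNBIT_B32 computes the low 32 bits of ({S0, S1} >> S2[4:0]) with S0 as the high dword, and V_BFI_B32 computes (S0 & S1) | (~S0 & S2). The helper names and the sample values for V0 and V1 are made up for illustration and are not taken from the patch.

```python
MASK32 = 0xFFFFFFFF

def v_alignbit_b32(s0, s1, s2):
    # Assumed semantics: low 32 bits of ({s0, s1} >> s2[4:0]), s0 is the high dword.
    return (((s0 << 32) | s1) >> (s2 & 31)) & MASK32

def v_bfi_b32(s0, s1, s2):
    # Assumed semantics: take bits from s1 where the mask s0 is set, else from s2.
    return ((s0 & s1) | (~s0 & s2)) & MASK32

# Hypothetical register contents: hi/lo halves are easy to tell apart.
V0 = 0xA1A2B1B2  # V0.hi = 0xA1A2, V0.lo = 0xB1B2
V1 = 0xC1C2D1D2  # V1.hi = 0xC1C2, V1.lo = 0xD1D2

# Desired build_vector: V1.hi in the upper half, V0.lo in the lower half.
want = (V1 & 0xFFFF0000) | (V0 & 0x0000FFFF)

# Either operand order of alignbit picks the wrong halves.
a = v_alignbit_b32(V0, V1, 16)  # V0.lo : V1.hi
b = v_alignbit_b32(V1, V0, 16)  # V1.lo : V0.hi

# BFI with a 0xFFFF0000 mask selects exactly the halves we want.
c = v_bfi_b32(0xFFFF0000, V1, V0)

print(hex(a), hex(b), hex(c), hex(want))
```

The only constant BFI needs here is 0xFFFF0000, which is the kind of mask other code may already have materialized.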


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134463/new/

https://reviews.llvm.org/D134463


