[llvm] [X86] combineTargetShuffle - fold VPERMV3(HI,MASK,LO) -> VPERMV(COMMUTE(MASK),CONCAT(LO,HI)) (PR #127199)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 14 07:38:26 PST 2025


================
@@ -42530,9 +42531,25 @@ static SDValue combineTargetShuffle(SDValue N, const SDLoc &DL,
     SmallVector<int, 32> Mask;
     if (getTargetShuffleMask(N, /*AllowSentinelZero=*/false, SrcOps, Mask)) {
----------------
RKSimon wrote:

getTargetShuffleMask extracts the shuffle mask indices for us, which makes it easier to create a fresh commuted mask. As I mentioned in the summary, to commute this mask we'd have to use an XOR of the MSB index bits, and that almost never constant folds, so we'd be replacing VPERMV3 with XOR+VPERMV, which I wasn't sure was a good enough tradeoff.
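
For illustration only, here is a standalone C++ sketch of what commuting a two-source shuffle mask means. It is not the code in this PR and uses no LLVM API: the helper names (`shuffle2`, `commuteMaskConstant`, `commuteMaskXor`) are hypothetical, and it assumes the usual convention that indices in [0, NumElts) select from the first source and indices in [NumElts, 2*NumElts) from the second, with NumElts a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference model of a two-source shuffle (not LLVM API).
// Mask[i] in [0, NumElts) picks from A, [NumElts, 2*NumElts) picks from B.
std::vector<int> shuffle2(const std::vector<int> &A, const std::vector<int> &B,
                          const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(A.size());
  std::vector<int> Out(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I)
    Out[I] = Mask[I] < NumElts ? A[Mask[I]] : B[Mask[I] - NumElts];
  return Out;
}

// Commute a mask whose indices are known: rebuild each index so it refers
// to the other source. This is the "fresh commuted mask" case.
std::vector<int> commuteMaskConstant(const std::vector<int> &Mask, int NumElts) {
  std::vector<int> Out;
  Out.reserve(Mask.size());
  for (int M : Mask)
    Out.push_back(M < NumElts ? M + NumElts : M - NumElts);
  return Out;
}

// Commute by flipping the source-select bit with XOR. For a mask that is
// only known at run time this is an extra operation that cannot be folded.
std::vector<int> commuteMaskXor(const std::vector<int> &Mask, int NumElts) {
  assert(NumElts > 0 && (NumElts & (NumElts - 1)) == 0 &&
         "NumElts must be a power of two");
  std::vector<int> Out;
  Out.reserve(Mask.size());
  for (int M : Mask)
    Out.push_back(M ^ NumElts);
  return Out;
}
```

Under those assumptions both helpers produce the same mask, and `shuffle2(A, B, Mask)` equals `shuffle2(B, A, commuted)`. The difference is only where the work happens: the first form can be done once when the indices are known, while the XOR form has to be emitted as a real instruction when they are not.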

https://github.com/llvm/llvm-project/pull/127199

