[llvm] [X86] Fold VPERMV(MASK,CONCAT(LO,HI)) -> VPERMV3(WIDEN(LO),MASK',WIDEN(HI)) (PR #129708)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 6 02:02:46 PST 2025
================
@@ -42607,6 +42607,43 @@ static SDValue combineTargetShuffle(SDValue N, const SDLoc &DL,
     return SDValue();
   }
+  case X86ISD::VPERMV: {
+    // Combine VPERMV to VPERMV3 if the source operand can be freely split.
+    SmallVector<int, 32> Mask;
+    SmallVector<SDValue, 2> SrcOps, SubOps;
+    SDValue Src = peekThroughBitcasts(N.getOperand(1));
+    if ((Subtarget.hasVLX() ||
+         (VT.is512BitVector() && Subtarget.hasAVX512())) &&
----------------
RKSimon wrote:
We already have a 512-bit VPERMV node at this point, so a useAVX512Regs check shouldn't be necessary, but we can add it as an additional precaution if you want.
https://github.com/llvm/llvm-project/pull/129708
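
For readers following the fold in the subject line, here is a minimal standalone sketch of the mask adjustment (the MASK -> MASK' step). It is not the code from the PR: the helper names (`remapMaskForWidenedHalves`, `permuteOne`, `permuteTwo`, `widen`) are hypothetical, vectors are modelled as plain `std::vector<int>`, and it assumes every mask index is in range and that a two-source permute selects from its first operand for indices below the element count and from its second operand otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): rewrite a one-source permute mask
// over CONCAT(LO, HI) into a two-source mask over WIDEN(LO), WIDEN(HI).
// Assumption: each mask index is already in [0, NumElts).
std::vector<int> remapMaskForWidenedHalves(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  const int HalfElts = NumElts / 2;
  std::vector<int> NewMask;
  NewMask.reserve(Mask.size());
  for (int M : Mask) {
    assert(M >= 0 && M < NumElts && "mask index out of range");
    // Indices into LO are unchanged. Indices into HI move from
    // [HalfElts, NumElts) to [NumElts, NumElts + HalfElts), i.e. the low
    // half of the second (widened) operand.
    NewMask.push_back(M < HalfElts ? M : M + HalfElts);
  }
  return NewMask;
}

// Reference model of a one-source variable permute: Result[i] = Src[Mask[i]].
std::vector<int> permuteOne(const std::vector<int> &Src,
                            const std::vector<int> &Mask) {
  std::vector<int> Result;
  for (int M : Mask)
    Result.push_back(Src[static_cast<std::size_t>(M)]);
  return Result;
}

// Reference model of a two-source variable permute: indices below N select
// from A, indices in [N, 2N) select from B.
std::vector<int> permuteTwo(const std::vector<int> &A,
                            const std::vector<int> &B,
                            const std::vector<int> &Mask) {
  const int N = static_cast<int>(A.size());
  std::vector<int> Result;
  for (int M : Mask)
    Result.push_back(M < N ? A[static_cast<std::size_t>(M)]
                           : B[static_cast<std::size_t>(M - N)]);
  return Result;
}

// Model of widening: place Half in the low elements and pad the upper
// elements with Fill (standing in for undefined lanes).
std::vector<int> widen(const std::vector<int> &Half, std::size_t NumElts,
                       int Fill) {
  std::vector<int> Wide(NumElts, Fill);
  for (std::size_t I = 0; I < Half.size(); ++I)
    Wide[I] = Half[I];
  return Wide;
}
```

Under these assumptions, `permuteOne(CONCAT(LO, HI), Mask)` and `permuteTwo(widen(LO), widen(HI), remapMaskForWidenedHalves(Mask))` produce the same elements, because the remapped indices never touch the padded upper lanes.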