[llvm] [X86] combineTargetShuffle - fold VPERMV3(HI,MASK,LO) -> VPERMV(COMMUTE(MASK),CONCAT(LO,HI)) (PR #127199)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 14 07:38:26 PST 2025
================
@@ -42530,9 +42531,25 @@ static SDValue combineTargetShuffle(SDValue N, const SDLoc &DL,
SmallVector<int, 32> Mask;
if (getTargetShuffleMask(N, /*AllowSentinelZero=*/false, SrcOps, Mask)) {
----------------
RKSimon wrote:
getTargetShuffleMask extracts the shuffle mask indices for us, which makes it easier to create a fresh commuted mask. As I mentioned in the summary, commuting the mask without it would mean XORing the MSB of each index, and that almost never constant folds, so we'd be replacing VPERMV3 with XOR+VPERMV, which I wasn't sure was a good enough tradeoff.
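To illustrate the MSB XOR mentioned above, here is a minimal standalone sketch of commuting a two-source shuffle mask once its indices are known. It is not the code from the PR: `commuteTwoSourceMask`, `permute2` and `permute1` are hypothetical names, plain `std::vector<int>` stands in for the DAG nodes, and it assumes the per-source element count is a power of two with every index in [0, 2*NumElts).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not an LLVM API): commute a two-source shuffle mask.
// Flipping the top index bit (M ^ NumElts) makes each index refer to the
// other source, which is what swapping the two sources requires.
std::vector<int> commuteTwoSourceMask(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  // Assumption: NumElts is a power of two, so XOR toggles exactly one bit.
  assert(NumElts > 0 && (NumElts & (NumElts - 1)) == 0);
  std::vector<int> Commuted;
  Commuted.reserve(Mask.size());
  for (int M : Mask)
    Commuted.push_back(M ^ NumElts);
  return Commuted;
}

// Reference model of a two-source permute: indices below NumElts select
// from A, the rest select from B.
std::vector<int> permute2(const std::vector<int> &A,
                          const std::vector<int> &Mask,
                          const std::vector<int> &B) {
  const int NumElts = static_cast<int>(A.size());
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (int M : Mask)
    Result.push_back(M < NumElts ? A[M] : B[M - NumElts]);
  return Result;
}

// Reference model of a single-source permute over a double-width vector.
std::vector<int> permute1(const std::vector<int> &Mask,
                          const std::vector<int> &Wide) {
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (int M : Mask)
    Result.push_back(Wide[M]);
  return Result;
}
```

With a constant mask the XOR happens at compile time, so the fold costs nothing extra. With a variable mask the same XOR would have to be emitted as a real instruction ahead of the permute.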
https://github.com/llvm/llvm-project/pull/127199