[llvm] [AArch64][GlobalISel] Combine G_EXTRACT_VECTOR_ELT and G_BUILD_VECTOR sequences into G_SHUFFLE_VECTOR (PR #110545)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 30 11:22:18 PDT 2024
================
@@ -4205,6 +4207,97 @@ void CombinerHelper::applyExtractVecEltBuildVec(MachineInstr &MI,
replaceSingleDefInstWithReg(MI, Reg);
}
+bool CombinerHelper::matchCombineExtractToShuffle(
+ MachineInstr &MI, SmallVectorImpl<std::pair<Register, int>> &VecIndexPair,
+ std::pair<Register, Register> &VectorRegisters) {
+ assert(MI.getOpcode() == TargetOpcode::G_BUILD_VECTOR);
+ const GBuildVector *Build = cast<GBuildVector>(&MI);
+ // This combine tries to find all the build vectors whose source elements
+ // all originate from a G_EXTRACT_VECTOR_ELT from one or two donor vectors.
+ // One example where this may happen is for AI chips where there are a lot
+ // of matrix multiplications. Typically these vectors are dissected and then
+ // rearranged into the right transformation.
+ // E.g.
+ // %donor1(<2 x s32>) = COPY $d0
+ // %donor2(<2 x s32>) = COPY $d1
+ // %ext1 = G_EXTRACT_VECTOR_ELT %donor1, 0
+ // %ext2 = G_EXTRACT_VECTOR_ELT %donor1, 1
+ // %ext3 = G_EXTRACT_VECTOR_ELT %donor2, 0
+ // %ext4 = G_EXTRACT_VECTOR_ELT %donor2, 1
+ // %vector = G_BUILD_VECTOR %ext1, %ext2, %ext3, %ext4
+ // ==>
+ // replace with:
+ // %vector = G_SHUFFLE_VECTOR %donor1, %donor2, shufflemask(0, 1, 2, 3)
+ SmallSetVector<Register, 2> RegisterVector;
+ const unsigned NumElements = Build->getNumSources();
+ for (unsigned Index = 0; Index < NumElements; Index++) {
+ Register SrcReg = peekThroughBitcast(Build->getSourceReg(Index), MRI);
+ auto *ExtractInstr = getOpcodeDef<GExtractVectorElement>(SrcReg, MRI);
+ if (!ExtractInstr)
+ return false;
+
+ // For shufflemasks we need to know exactly what index to place each element
+ // so if this build vector doesn't use exclusively constant indices then we
+ // can't replace it with a shufflevector.
+ auto Cst = getIConstantVRegVal(ExtractInstr->getIndexReg(), MRI);
+ if (!Cst)
+ return false;
+ unsigned Idx = Cst->getZExtValue();
+
+ Register VectorReg = ExtractInstr->getVectorReg();
+ RegisterVector.insert(VectorReg);
+ VecIndexPair.emplace_back(std::make_pair(VectorReg, Idx));
+ }
+
+ // Create a pair so that we don't need to look for them later. This code is
+ // incorrect if we have more than two vectors in the set. Since we can only
+ // put two vectors in a shuffle, we reject any solution with more than two
+ // anyways.
+ VectorRegisters =
+ std::make_pair(RegisterVector.front(), RegisterVector.back());
+
+ // We check that they're the same type before running. We can also grow the
+ // smaller one to the target size, but there isn't an elegant way to do that
+ // until we have a good lowering for G_EXTRACT_SUBVECTOR.
+ if (MRI.getType(VectorRegisters.first) != MRI.getType(VectorRegisters.second))
+ return false;
+
----------------
ValentijnvdBeek wrote:
In the best-case scenario, you would widen the smaller vector to the right size by adding `G_CONCAT_VECTORS` instructions until both vectors are the same size. `G_EXTRACT_SUBVECTOR` was added a few months ago and, very recently, a lot of work has gone into implementing it for the RISC-V backend. But I don't think there is a nice fallback legalization strategy yet, which is what I would want before relying on it in the Combiner.
What has worked well (?) for me before is adding `G_CONCAT_VECTORS` and `G_UNMERGE_VALUES` until the target size has been reached. But I am not sure that is the best strategy. If anybody has a good idea, please let me know so that I can work on it.
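For readers following the thread, here is a minimal standalone sketch of how the shuffle mask falls out of the (donor, index) pairs collected by the match above. It is a model only, not the PR's code: `Reg`, `ShuffleModel`, and `buildShuffleMask` are hypothetical names, plain integers stand in for virtual registers, and both donors are assumed to already have the same lane count (the case the type check in the hunk accepts).

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a virtual register; not an LLVM type.
using Reg = unsigned;

// The two donor vectors plus the mask a two-input shuffle would use.
struct ShuffleModel {
  Reg First;
  Reg Second;
  std::vector<int> Mask;
};

// Each build-vector source is modelled as a (donor, constant lane) pair.
// Returns std::nullopt if there are no sources, a lane is out of range,
// or more than two distinct donors are involved.
std::optional<ShuffleModel>
buildShuffleMask(const std::vector<std::pair<Reg, int>> &VecIndexPair,
                 int DonorLanes) {
  std::vector<Reg> Donors; // unique donors in first-seen order
  for (const auto &[Vec, Idx] : VecIndexPair) {
    if (Idx < 0 || Idx >= DonorLanes)
      return std::nullopt;
    if (std::find(Donors.begin(), Donors.end(), Vec) == Donors.end())
      Donors.push_back(Vec);
    if (Donors.size() > 2)
      return std::nullopt; // a shuffle only takes two inputs
  }
  if (Donors.empty())
    return std::nullopt;

  ShuffleModel Result{Donors.front(), Donors.back(), {}};
  // Lanes of the first donor keep their index; lanes of the second donor
  // are offset by the donor lane count.
  for (const auto &[Vec, Idx] : VecIndexPair)
    Result.Mask.push_back(Vec == Result.First ? Idx : Idx + DonorLanes);
  return Result;
}
```

With the example from the code comment (two `<2 x s32>` donors, lanes 0 and 1 of each), this model yields the mask `(0, 1, 2, 3)`. If the smaller donor were widened first, the offset for the second donor would have to use the widened lane count, which is one reason the mismatched-type case needs care.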
https://github.com/llvm/llvm-project/pull/110545