[llvm] [X86][AVX] Match v4f64 blend from shuffle of scalar values. (PR #135753)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 15 06:40:27 PDT 2025
================
@@ -9040,6 +9040,39 @@ X86TargetLowering::LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const {
MVT OpEltVT = Op.getOperand(0).getSimpleValueType();
unsigned NumElems = Op.getNumOperands();
+ // Match BUILD_VECTOR of scalars that we can lower to X86ISD::BLENDI via
+ // shuffles.
+ //
+ // v4f64 = BUILD_VECTOR X,Y,Y,X
+ // >>>
+ // t1: v4f64 = BUILD_VECTOR X,u,u,u
+ // t3: v4f64 = vector_shuffle<0,u,u,0> t1, u
+ // t2: v4f64 = BUILD_VECTOR Y,u,u,u
+ // t4: v4f64 = vector_shuffle<u,0,0,u> t2, u
+ // v4f64 = vector_shuffle<0,5,6,3> t3, t4
+ //
+ if (Subtarget.hasAVX() && VT == MVT::v4f64 && Op->getNumOperands() == 4u) {
----------------
RKSimon wrote:
We can generalize this to handle any "blend of a pair of splats" pattern, as long as the BUILD_VECTOR uses just 2 distinct scalars. That would work with SSE41 or later and with 16-bit or larger elements.
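
A minimal standalone sketch of that generalized match, to make the idea concrete. It does not use LLVM APIs: operands are modeled as integer ids (equal id means same scalar), and `TwoSplatBlend` and `matchTwoSplatBlend` are hypothetical names for illustration only. The subtarget (SSE41+) and element-width (16-bit or larger) checks are assumed to happen elsewhere, and undef operands are not handled.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type for illustration; not an LLVM type.
struct TwoSplatBlend {
  int First;             // scalar splatted into the first shuffle input
  int Second;            // scalar splatted into the second shuffle input
  std::vector<int> Mask; // two-input shuffle mask selecting between them
};

// Ops models the BUILD_VECTOR operands as integer ids.
// Returns a blend description if exactly two distinct scalars are used.
std::optional<TwoSplatBlend> matchTwoSplatBlend(const std::vector<int> &Ops) {
  if (Ops.size() < 2)
    return std::nullopt;

  const int NumElems = static_cast<int>(Ops.size());
  TwoSplatBlend R{Ops[0], 0, {}};
  bool HaveSecond = false;

  for (int I = 0; I < NumElems; ++I) {
    if (Ops[I] == R.First) {
      // Lane I comes from the splat of the first scalar.
      R.Mask.push_back(I);
    } else if (!HaveSecond || Ops[I] == R.Second) {
      // Lane I comes from the splat of the second scalar; indices into the
      // second shuffle input are offset by NumElems.
      R.Second = Ops[I];
      HaveSecond = true;
      R.Mask.push_back(I + NumElems);
    } else {
      // A third distinct scalar: not a blend of a pair of splats.
      return std::nullopt;
    }
  }

  // Only one scalar seen: this is a plain splat, not a blend.
  if (!HaveSecond)
    return std::nullopt;
  return R;
}
```

For operands X,Y,Y,X this produces the mask <0,5,6,3>, the same final shuffle shown in the quoted comment. Because every lane keeps its own index (I or I + NumElems), the mask is always a per-lane select between the two inputs, which is the property a blend needs.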
https://github.com/llvm/llvm-project/pull/135753