[PATCH] D96405: [DAGCombiner] Improve reduceBuildVecToShuffle Performance
Michael Marjieh via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 11 01:32:58 PST 2021
mmarjieh added inline comments.
================
Comment at: llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll:115
+; X86-NEXT: vpsraw $8, %xmm1, %xmm1
+; X86-NEXT: vpunpcklqdq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]
+; X86-NEXT: vmovdqu %ymm0, (%eax)
----------------
RKSimon wrote:
> At quick glance - this looks wrong, I'd expect this still to be the same vshufpd?
I am not familiar with the X86 ISA.
Can you explain why you would expect the same vshufpd here?
Meanwhile, here is how the DAG differs before and after my patch:
Before this patch:
SelectionDAG has 28 nodes:
t0: ch = EntryToken
t5: v4i64,ch = load<(load 32 from `<4 x i64>* null`, align 8)> t0, Constant:i32<0>, undef:i32
t6: v4i64,ch = load<(load 32 from `<4 x i64>* undef`, align 8)> t0, undef:i32, undef:i32
t21: ch = TokenFactor t5:1, t6:1
t36: v8i16 = BUILD_VECTOR Constant:i16<0>, Constant:i16<0>, Constant:i16<0>, Constant:i16<0>, undef:i16, undef:i16, undef:i16, undef:i16
t52: v2f64 = bitcast t36
t62: v4f64 = concat_vectors t52, undef:v2f64
t63: v4f64 = vector_shuffle<u,u,0,0> t62, undef:v4f64
t37: v8i16 = X86ISD::VTRUNC t5
t39: v8i16 = sign_extend_inreg t37, ValueType:ch:v8i8
t44: v2f64 = bitcast t39
t40: v8i16 = X86ISD::VTRUNC t6
t41: v8i16 = sign_extend_inreg t40, ValueType:ch:v8i8
t49: v2f64 = bitcast t41
t59: v4f64 = concat_vectors t44, t49
t68: v4f64 = vector_shuffle<4,6,2,3> t63, t59
t3: i32,ch = load<(load 4 from %fixed-stack.0)> t0, FrameIndex:i32<-1>, undef:i32
t57: ch = store<(store 32 into %ir.10, align 2)> t21, t68, t3, undef:i32
t29: ch = X86ISD::RET_FLAG t57, TargetConstant:i32<0>
After this patch:
SelectionDAG has 26 nodes:
t0: ch = EntryToken
t5: v4i64,ch = load<(load 32 from `<4 x i64>* null`, align 8)> t0, Constant:i32<0>, undef:i32
t6: v4i64,ch = load<(load 32 from `<4 x i64>* undef`, align 8)> t0, undef:i32, undef:i32
t21: ch = TokenFactor t5:1, t6:1
t37: v8i16 = X86ISD::VTRUNC t5
t39: v8i16 = sign_extend_inreg t37, ValueType:ch:v8i8
t44: v2f64 = bitcast t39
t40: v8i16 = X86ISD::VTRUNC t6
t41: v8i16 = sign_extend_inreg t40, ValueType:ch:v8i8
t49: v2f64 = bitcast t41
t59: v4f64 = concat_vectors t44, t49
t36: v8i16 = BUILD_VECTOR Constant:i16<0>, Constant:i16<0>, Constant:i16<0>, Constant:i16<0>, undef:i16, undef:i16, undef:i16, undef:i16
t52: v2f64 = bitcast t36
t62: v4f64 = concat_vectors t52, undef:v2f64
t64: v4f64 = vector_shuffle<0,2,4,4> t59, t62
t3: i32,ch = load<(load 4 from %fixed-stack.0)> t0, FrameIndex:i32<-1>, undef:i32
t57: ch = store<(store 32 into %ir.10, align 2)> t21, t64, t3, undef:i32
t29: ch = X86ISD::RET_FLAG t57, TargetConstant:i32<0>
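One way to compare the two dumps above is to model the shuffle masks directly. The following is a minimal Python sketch, not part of the patch. It assumes the usual vector_shuffle semantics (a mask index below N selects from the first operand, N or above selects from the second, and u is undef), and it assumes that bitcasting the v8i16 BUILD_VECTOR to v2f64 on little-endian X86 yields a zero element followed by an undef element. The lane labels such as "t44[0]" are symbolic placeholders, not real values.

```python
UNDEF = "undef"

def concat_vectors(lo, hi):
    """Model concat_vectors: lanes of lo followed by lanes of hi."""
    return list(lo) + list(hi)

def vector_shuffle(mask, a, b):
    """Model vector_shuffle<mask> a, b; None stands for an undef (u) index."""
    n = len(a)
    out = []
    for m in mask:
        if m is None:
            out.append(UNDEF)
        elif m < n:
            out.append(a[m])      # index selects from the first operand
        else:
            out.append(b[m - n])  # index selects from the second operand
    return out

# Symbolic lanes (assumption: t52 is <0.0, undef> after the bitcast of t36).
t52 = ["zero", UNDEF]
t44 = ["t44[0]", "t44[1]"]
t49 = ["t49[0]", "t49[1]"]
undef_v2 = [UNDEF] * 2
undef_v4 = [UNDEF] * 4

t62 = concat_vectors(t52, undef_v2)
t59 = concat_vectors(t44, t49)

# Before the patch: two shuffles.
t63 = vector_shuffle([None, None, 0, 0], t62, undef_v4)
t68 = vector_shuffle([4, 6, 2, 3], t63, t59)

# After the patch: a single shuffle with swapped operands.
t64 = vector_shuffle([0, 2, 4, 4], t59, t62)

print("before:", t68)
print("after: ", t64)
```

Under this model both stored values select the same four lanes, so the two DAGs look equivalent at this level; whether the selected X86 instruction sequence is also equivalent is a separate question that the sketch does not address.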
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96405/new/
https://reviews.llvm.org/D96405