[PATCH] [X86, AVX] use blends instead of insert128 with index 0

Andrea Di Biagio Andrea_DiBiagio at sn.scee.net
Thu Mar 19 13:13:07 PDT 2015

Thanks Sanjay.
I made a couple of comments (see below). Otherwise the patch looks good to me.

Comment at: lib/Target/X86/X86ISelLowering.cpp:185-186
@@ +184,4 @@
+      SDValue Mask = DAG.getConstant(MaskVal, MVT::i8);
+      SDValue Vec2 = DAG.getNode(ISD::INSERT_SUBVECTOR, dl, ResultVT, Undef,
+                                   Vec, ZeroIndex);
+      return DAG.getNode(X86ISD::BLENDI, dl, ResultVT, Result, Vec2, Mask);
So, the INSERT_SUBVECTOR node is always generated, regardless of whether 'ScalarType' is floating point or not. I think it makes sense to factor out the logic common to the floating-point and integer cases. For example, you could create the INSERT_SUBVECTOR node immediately after line 175, which would let you get rid of the code at around line 203.
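
To make the suggested shape concrete, here is a minimal standalone sketch. It is a plain C++ model, not SelectionDAG code: the names Vec128, Vec256, widenToUndef, blendImm and insertLow128 are hypothetical, and the 0x0F mask assumes that a set bit selects the element from the second blend operand.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec128 = std::array<float, 4>;
using Vec256 = std::array<float, 8>;

// Models INSERT_SUBVECTOR(undef, Vec, 0). The upper lane is "don't care";
// this model simply leaves it zero-initialized.
Vec256 widenToUndef(const Vec128 &vec) {
  Vec256 out{};
  for (std::size_t i = 0; i < vec.size(); ++i)
    out[i] = vec[i];
  return out;
}

// Models an immediate blend: bit i of the mask selects b[i], otherwise a[i].
Vec256 blendImm(const Vec256 &a, const Vec256 &b, std::uint8_t mask) {
  Vec256 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = ((mask >> i) & 1u) ? b[i] : a[i];
  return out;
}

// The widened vector is built once, up front, so any type-specific path
// that follows can reuse it instead of rebuilding it.
Vec256 insertLow128(const Vec256 &result, const Vec128 &vec) {
  const Vec256 widened = widenToUndef(vec);
  const std::uint8_t mask = 0x0F; // assumed: low four elements come from 'widened'
  return blendImm(result, widened, mask);
}
```

In the real lowering code the integer path may need a different node or mask; the only point here is that the widening step does not depend on the scalar type.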

Comment at: test/CodeGen/X86/avx-cast.ll:9-12
@@ -6,1 +8,6 @@
 define <8 x float> @castA(<4 x float> %m) nounwind uwtable readnone ssp {
+; AVX1-LABEL: castA:
+; AVX1:         vxorps %ymm1, %ymm1, %ymm1
+; AVX1-NEXT:    vblendps {{.*#+}} ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
+; AVX1-NEXT:    retq
Would it be possible to also have a test where the vector is not inserted into a zero vector? As far as I can see, all the test cases you modified only cover the case where a vector is inserted in the low 128-bit lane of a zero vector.
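
As a rough illustration of what such a test would be checking, here is a small standalone C++ model (again not LLVM code; referenceInsertLow and blendInsertLow are hypothetical names, and the mask convention is an assumption). With a non-zero destination, the blend-based form should still match a plain insert at index 0 and leave the upper lane of the destination alone.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec128 = std::array<float, 4>;
using Vec256 = std::array<float, 8>;

// Reference semantics of a 128-bit insert at index 0:
// overwrite the low lane, keep the upper lane of the destination.
Vec256 referenceInsertLow(Vec256 dst, const Vec128 &src) {
  for (std::size_t i = 0; i < src.size(); ++i)
    dst[i] = src[i];
  return dst;
}

// Blend-based form: widen 'src', then pick the low four elements from it.
// The upper lane of the widened value is filled with -1 as a stand-in for
// undef, to show that the result does not depend on it.
Vec256 blendInsertLow(const Vec256 &dst, const Vec128 &src) {
  Vec256 widened;
  widened.fill(-1.0f);
  for (std::size_t i = 0; i < src.size(); ++i)
    widened[i] = src[i];

  const std::uint8_t mask = 0x0F; // assumed: set bit selects from 'widened'
  Vec256 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = ((mask >> i) & 1u) ? widened[i] : dst[i];
  return out;
}
```

A zero destination cannot distinguish "upper lane preserved" from "upper lane zeroed", which is why a non-zero destination makes for a stronger check.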


