[PATCH] D11518: [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level
Chandler Carruth
chandlerc at gmail.com
Sun Jul 26 22:09:13 PDT 2015
chandlerc added a comment.
By and large, really, really nice.
Comments inline.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12991-12994
@@ -12990,8 +12990,6 @@
- for (unsigned i = 0; i != NumElts; ++i) {
- SDValue Elt = RHS.getOperand(i);
- if (isAllOnesConstant(Elt))
- Indices.push_back(i);
- else if (isNullConstant(Elt))
- Indices.push_back(NumElts+i);
- else
+ // Attempt to create a valid clear mask, splitting the mask into
+ // sub elements and checking to see if each is
+ // all zeros or all ones - suitable for shuffle masking.
+ auto BuildClearMask = [&](unsigned Split) {
+ unsigned NumSubElts = NumElts * Split;
----------------
How annoying would it be to fully split this out into a helper function? The indentation is already fairly deep here.
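For illustration, the extraction might look something like this (the helper name and explicit parameter list are hypothetical, and since the quoted context cuts off before the per-sub-element constant check, that part is simplified to whole-element constants here):

  // Hypothetical free-function form of the quoted lambda; the lambda's
  // captures become explicit parameters. The real patch presumably also
  // splits a mixed constant into Split-sized chunks via SubIdx, which is
  // cut off above, so this sketch only accepts whole-element constants.
  static bool buildClearMask(SDValue RHS, unsigned NumElts, unsigned Split,
                             SmallVectorImpl<int> &Indices) {
    unsigned NumSubElts = NumElts * Split;
    for (unsigned i = 0; i != NumSubElts; ++i) {
      SDValue Elt = RHS.getOperand(i / Split);
      if (isAllOnesConstant(Elt))
        Indices.push_back(i);              // sub-element passes through
      else if (isNullConstant(Elt))
        Indices.push_back(NumSubElts + i); // sub-element is cleared to zero
      else
        return false; // not expressible as a shuffle clear mask
    }
    return true;
  }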
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12999-13001
@@ +12998,5 @@
+ SmallVector<int, 8> Indices;
+ for (unsigned i = 0; i != NumSubElts; ++i) {
+ unsigned EltIdx = i / Split;
+ unsigned SubIdx = i % Split;
+ SDValue Elt = RHS.getOperand(EltIdx);
----------------
I would just consistently use 'int' here.
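A minimal sketch of what I mean (assuming the surrounding counts are made int as well, so everything feeding SmallVector<int, 8> Indices stays same-signed):

  int NumSubElts = NumElts * Split;
  for (int i = 0; i != NumSubElts; ++i) {
    int EltIdx = i / Split;
    int SubIdx = i % Split;
    SDValue Elt = RHS.getOperand(EltIdx);
    // ...
  }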
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13118-13121
@@ -13079,2 +13117,6 @@
+ // Try to convert a constant mask AND into a shuffle clear mask.
+ if (SDValue Shuffle = XformToShuffleWithZero(N))
+ return Shuffle;
+
// Type legalization might introduce new shuffles in the DAG.
----------------
Is there a particular reason this needs to be adjusted this way?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:6884-6885
@@ +6883,4 @@
+
+ if (SDValue Masked = lowerVectorShuffleAsBitMask(DL, VT, V1, V2, Mask, DAG))
+ return Masked;
+
----------------
Please send a separate review with just this (the bit-mask lowering) change?
It seems likely to be independently testable.
================
Comment at: test/CodeGen/X86/sse3.ll:272-275
@@ -271,4 +271,6 @@
; X64: ## BB#0: ## %entry
-; X64-NEXT: movddup {{.*#+}} xmm0 = mem[0,0]
-; X64-NEXT: andpd {{.*}}(%rip), %xmm0
+; X64-NEXT: movaps (%rax), %xmm0
+; X64-NEXT: unpcklps {{.*#+}} xmm0 = xmm0[0,0,1,1]
+; X64-NEXT: pxor %xmm1, %xmm1
+; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
; X64-NEXT: retq
----------------
This seems like a pretty bad regression. Any hope of recovering it?
================
Comment at: test/CodeGen/X86/vec_cast2.ll:49-54
@@ -48,6 +48,8 @@
; CHECK: ## BB#0:
-; CHECK-NEXT: vpmovzxwd {{.*#+}} xmm1 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
-; CHECK-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4,4,5,5,6,6,7,7]
-; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
-; CHECK-NEXT: vandps LCPI2_0, %ymm0, %ymm0
+; CHECK-NEXT: vpunpckhwd {{.*#+}} xmm1 = xmm0[4,4,5,5,6,6,7,7]
+; CHECK-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0]
+; CHECK-NEXT: vpand %xmm2, %xmm1, %xmm1
+; CHECK-NEXT: vpmovzxwd {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
+; CHECK-NEXT: vpand %xmm2, %xmm0, %xmm0
+; CHECK-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
; CHECK-NEXT: vcvtdq2ps %ymm0, %ymm0
----------------
Same question as above...
================
Comment at: test/CodeGen/X86/vec_int_to_fp.ll:1762-1765
@@ -1771,2 +1761,6 @@
; AVX1-NEXT: vpunpckhwd {{.*#+}} xmm1 = xmm1[4,4,5,5,6,6,7,7]
+; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0]
+; AVX1-NEXT: vpand %xmm2, %xmm1, %xmm1
+; AVX1-NEXT: vpmovzxbd {{.*#+}} xmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
+; AVX1-NEXT: vpand %xmm2, %xmm0, %xmm0
; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
----------------
And here, wow!
Repository:
rL LLVM
http://reviews.llvm.org/D11518