[PATCH] D43608: [X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel

Wed Feb 21 21:44:35 PST 2018

craig.topper added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:6934
         BOperand = Ld.getOperand(0).getOperand(0);
-      if (BOperand.getValueType().isVector() &&
-          BOperand.getSimpleValueType().getVectorElementType() == MVT::i1) {
-        if ((EltType == MVT::i64 && (VT.getVectorElementType() == MVT::i8 ||
-                                     NumElts == 8)) || // for broadcastmb2q
-            (EltType == MVT::i32 && (VT.getVectorElementType() == MVT::i16 ||
-                                     NumElts == 16))) { // for broadcastmw2d
-          SDValue Brdcst =
-              DAG.getNode(X86ISD::VBROADCASTM, dl,
-                          MVT::getVectorVT(EltType, NumElts), BOperand);
-          return DAG.getBitcast(VT, Brdcst);
-        }
+      MVT MaskVT = BOperand.getSimpleValueType();
+      if ((EltType == MVT::i64 && MaskVT == MVT::v8i1) || // for broadcastmb2q
----------------
This isn't strictly related but is needed to prevent a regression. The handling for the case when "ZeroExtended" is null and "Ld" is non-null doesn't work correctly.

In the Ld case, the VT.getVectorElementType() and EltType are the same. But we need the element type we're zero extending from.

I think previously target independent DAG combine ran reduceBuildVecExtToExtBuildVec after setcc was turned into pcmpm. This pushed the code to use "ZeroExtended" case. But now we don't change the setcc, so the target independent DAG combine doesn't get a chance to run. Leaving us with the "Ld" case.

================
Comment at: lib/Target/X86/X86InstrAVX512.td:3180
 let Predicates = [HasAVX512, NoVLX] in {
+  let AddedComplexity = 2 in {
   defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTD", v8i32x_info, v16i32_info>;
----------------
This is needed because the explicit CondCode match doesn't seem to increase the pattern complexity the way an explicit CondCode match does.

================
Comment at: test/CodeGen/X86/broadcastm-lowering.ll:143
 ; AVX512CD-NEXT:    kmovw %k0, %eax
-; AVX512CD-NEXT:    vpxor %xmm0, %xmm0, %xmm0
-; AVX512CD-NEXT:    vpinsrb $0, %eax, %xmm0, %xmm0
-; AVX512CD-NEXT:    vpinsrb $8, %eax, %xmm0, %xmm0
-; AVX512CD-NEXT:    vinserti128 $1, %xmm0, %ymm0, %ymm0
+; AVX512CD-NEXT:    movzbl %al, %eax
+; AVX512CD-NEXT:    vmovq %rax, %xmm0
----------------
I'm not sure why this changed. I think its because vector lowering didn't change anything. It used to change setcc to pcmpm. But it doesn't now, so the post vector lowering DAG combine doesn't run. And there's a target independent DAG combine, reduceBuildVecExtToExtBuildVec, that is only allowed to run after legalize vector ops. And if the post vector ops dag combine doesn't run, then the build vector is lowered before the last DAG combine runs so reduceBuildVecExtToExtBuildVec doesn't get called.

https://reviews.llvm.org/D43608