[PATCH] D37348: Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.

Fri Sep 8 13:22:12 PDT 2017

arsenm added inline comments.

================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:2784-2793
+    unsigned len = VT.getSizeInBits();
+
+    if (TLI.isOperationLegalOrCustom(ISD::CTTZ_ZERO_UNDEF, VT)) {
+      EVT SetCCVT = getSetCCResultType(VT);
+      SDValue CTTZ = DAG.getNode(ISD::CTTZ_ZERO_UNDEF, dl, VT, Op);
+      SDValue Zero = DAG.getConstant(0, dl, VT);
+      SDValue SrcIsZero = DAG.getSetCC(dl, SetCCVT, Op, Zero, ISD::SETEQ);
----------------
I don't understand why you have this or most of the other changes. This shouldn't be substantially different from how we handle ctlz already. i.e. I would expect to see another version of AMDGPUTargetLowering::performCtlzCombine that does essentially the same thing for CTTZ.

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:420
   if (Subtarget->hasFFBL())
-    setOperationAction(ISD::CTTZ_ZERO_UNDEF, MVT::i32, Legal);

----------------
This should definitely remain legal

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2029
+
+  SDValue Vec = DAG.getNode(ISD::BITCAST, SL, MVT::v2i32, Src);
+
----------------
You didn't enable custom lowering for i64, so this is dead code. You also didn't a dd any tests for it. In either case, it should be in a separate patch from the i32 handling.

================
Comment at: lib/Target/AMDGPU/AMDGPUInstrInfo.td:301-302

+def AMDGPUffbl_u32 : SDNode<"AMDGPUISD::FFBL_U32", SDTIntUnaryOp>;
+def AMDGPUffbl_i32 : SDNode<"AMDGPUISD::FFBL_I32", SDTIntUnaryOp>;
+
----------------
This isn't a signed/unsigned operation. There is just one v_ffbl_b32.

Repository:
  rL LLVM

https://reviews.llvm.org/D37348