[PATCH] D37348: Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.

Fri Sep 15 10:17:56 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2036-2042
+  if (isCtlzOpc(Op.getOpcode())) {
+	ISDOpc = ISD::CTLZ_ZERO_UNDEF;
+    AMDGPUISDOpc = AMDGPUISD::FFBH_U32;
+  } else if (isCttzOpc(Op.getOpcode())){
+	ISDOpc = ISD::CTTZ_ZERO_UNDEF;
+    AMDGPUISDOpc = AMDGPUISD::FFBL_U32;
+  } else
----------------
Indentation wrong

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2043
+  } else
+    assert("Invalid OPCode!!!");
+
----------------
llvm_unreachable

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2061-2064
+  if (isCtlzOpc(Op.getOpcode()))
+    Hi0 = DAG.getSetCC(SL, SetCCVT, Hi, Zero, ISD::SETEQ);
+  else
+    Hi0 = DAG.getSetCC(SL, SetCCVT, Hi, One, ISD::SETEQ);
----------------
Select between Zero and One as input to getSetCC

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3008-3009
+                                          unsigned Opc) const {
+  unsigned AMDGPUISDOpc = isCttzOpc(Opc) ? AMDGPUISD::FFBL_U32 :
+                                           AMDGPUISD::FFBH_U32;
+//unsigned AMDGPUISDOpc = AMDGPUISD::FFBH_U32;
----------------
You could just pass in the new opcode directly rather than selecting it again

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3010
+                                           AMDGPUISD::FFBH_U32;
+//unsigned AMDGPUISDOpc = AMDGPUISD::FFBH_U32;
+
----------------
Commented out code

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3035
 // TODO: Should probably combine against FFBH_U32 instead of ctlz directly.
-SDValue AMDGPUTargetLowering::performCtlzCombine(const SDLoc &SL, SDValue Cond,
+SDValue AMDGPUTargetLowering::performCtlz_CttzCombine(const SDLoc &SL, SDValue Cond,
                                                  SDValue LHS, SDValue RHS,
----------------
You didn't add tests for this part

================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.h:374
   FFBH_I32,
+  FFBL_U32, // cttz with -1 if input is zero.
   MUL_U24,
----------------
Should be name FFBL_B32 to match the instruction

================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:131
+}
+
----------------
Also should have some tests with i8/i16

https://reviews.llvm.org/D37348