[PATCH] D37348: Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 14 12:41:41 PDT 2017


arsenm added a comment.

This is missing an equivalent of performCtlzCombine for cttz.



================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:2803-2806
     // If ISD::CTLZ is legal and CTPOP isn't, then do that instead.
     if (!TLI.isOperationLegalOrCustom(ISD::CTPOP, VT) &&
         TLI.isOperationLegalOrCustom(ISD::CTLZ, VT))
       return DAG.getNode(ISD::SUB, dl, VT,
----------------
arsenm wrote:
> OK, I see the default expansion here isn't the compare and select like I expected. Since the compare+select implementation is likely more instructions with the compare than the sub/ctpop implementation, that one should be tried first.
I don't see this changed.
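
For reference, the three cttz expansions under discussion can be sketched in plain C++ as below. This is only an illustration of the bit identities on 32-bit values, not the SelectionDAG code from the patch. All helper names (popcount32, ctlz32, cttz_zero_undef32, and so on) are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: portable population count.
static unsigned popcount32(uint32_t x) {
  unsigned n = 0;
  while (x) { x &= x - 1; ++n; }  // clear the lowest set bit each step
  return n;
}

// Hypothetical helper: count leading zeros, returning 32 for x == 0.
static unsigned ctlz32(uint32_t x) {
  unsigned n = 0;
  for (uint32_t mask = 0x80000000u; mask && !(x & mask); mask >>= 1) ++n;
  return n;
}

// Stand-in for CTTZ_ZERO_UNDEF: only meaningful for x != 0.
// Callers must not pass zero (the result is treated as undefined).
static unsigned cttz_zero_undef32(uint32_t x) {
  unsigned n = 0;
  while (!(x & 1u)) { x >>= 1; ++n; }
  return n;
}

// Expansion A (sub-free ctpop form): cttz(x) = ctpop(~x & (x - 1)).
static unsigned cttz_via_ctpop(uint32_t x) {
  return popcount32(~x & (x - 1));
}

// Expansion B (sub + ctlz form): cttz(x) = 32 - ctlz(~x & (x - 1)).
static unsigned cttz_via_ctlz(uint32_t x) {
  return 32u - ctlz32(~x & (x - 1));
}

// Expansion C (compare + select form): pick the bit width when x is zero,
// otherwise use the zero-undef variant.
static unsigned cttz_via_select(uint32_t x) {
  return x == 0 ? 32u : cttz_zero_undef32(x);
}
```

All three forms agree, including on zero input, where each returns the bit width (32 here). Which one is cheapest depends on what the target supports natively, which is the point of the ordering question above.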


================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:109
+; EG: MEM_RAT_CACHELESS STORE_RAW [[RESULT:T[0-9]+\.[XYZW]]]
+define amdgpu_kernel void @v_cttz_zero_undef_i64_with_select(i64 addrspace(1)* noalias %out, i64 addrspace(1)* nocapture readonly %arrayidx) nounwind {
+  %val = load i64, i64 addrspace(1)* %arrayidx, align 1
----------------
Missing a scalar version of this test.


Repository:
  rL LLVM

https://reviews.llvm.org/D37348




