[PATCH] D37348: Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 14 12:41:41 PDT 2017
arsenm added a comment.
This is missing an equivalent of performCtlzCombine for cttz.
================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:2803-2806
// If ISD::CTLZ is legal and CTPOP isn't, then do that instead.
if (!TLI.isOperationLegalOrCustom(ISD::CTPOP, VT) &&
TLI.isOperationLegalOrCustom(ISD::CTLZ, VT))
return DAG.getNode(ISD::SUB, dl, VT,
----------------
arsenm wrote:
> OK, I see the default expansion here isn't the compare and select like I expected. Since the compare+select implementation is likely more instructions with the compare than the sub/ctpop implementation, that one should be tried first.
I don't see that this was changed.
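
For reference, the expansions under discussion can be sketched on plain 32-bit integers. This is only an illustration of the arithmetic, not the SelectionDAG code: all function names below are made up for the example, the width is fixed at 32, and cttzRef stands in for a cttz_zero_undef that happens to be well defined here.

```cpp
#include <cassert>
#include <cstdint>

// Reference: scan upward from bit 0; returns 32 when x == 0.
static unsigned cttzRef(uint32_t x) {
  unsigned n = 0;
  while (n < 32 && ((x >> n) & 1u) == 0)
    ++n;
  return n;
}

// Population count by repeatedly clearing the lowest set bit.
static unsigned ctpop32(uint32_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// Leading-zero count by scanning down from bit 31; returns 32 when x == 0.
static unsigned ctlz32(uint32_t x) {
  unsigned n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// ~x & (x - 1) sets exactly the bits below the lowest set bit of x,
// so counting those bits gives cttz(x) (and 32 for x == 0).
static unsigned cttzViaCtpop(uint32_t x) { return ctpop32(~x & (x - 1)); }

// Same mask, but the count is recovered as 32 - ctlz(mask).
static unsigned cttzViaCtlz(uint32_t x) { return 32 - ctlz32(~x & (x - 1)); }

// Compare + select form: handle zero explicitly, otherwise use the
// zero-undef variant (modeled here by cttzRef).
static unsigned cttzViaSelect(uint32_t x) {
  return x == 0 ? 32u : cttzRef(x);
}
```

All three forms agree for every input, including zero, so the choice between them is purely about which one lowers to fewer instructions on the target.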
================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:109
+; EG: MEM_RAT_CACHELESS STORE_RAW [[RESULT:T[0-9]+\.[XYZW]]]
+define amdgpu_kernel void @v_cttz_zero_undef_i64_with_select(i64 addrspace(1)* noalias %out, i64 addrspace(1)* nocapture readonly %arrayidx) nounwind {
+ %val = load i64, i64 addrspace(1)* %arrayidx, align 1
----------------
This is missing a scalar version of the test.
Repository:
rL LLVM
https://reviews.llvm.org/D37348