[PATCH] D37348: Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 11 11:05:09 PDT 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2031
+
+SDValue AMDGPUTargetLowering:: LowerCTLZ_CTTZ(SDValue Op, SelectionDAG &DAG) const {
SDLoc SL(Op);
----------------
Extra space after ::
================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2040
+ ISDOpc = ISD::CTLZ_ZERO_UNDEF;
+ AMDGPUISDOpc = AMDGPUISD::FFBH_U32;
+ } else if (isCttzOpc(Op.getOpcode())){
----------------
Don't include AMDGPUISD in the variable name
================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2041
+ AMDGPUISDOpc = AMDGPUISD::FFBH_U32;
+ } else if (isCttzOpc(Op.getOpcode())){
+ ISDOpc = ISD::CTTZ_ZERO_UNDEF;
----------------
Missing space before {
================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2073
+ Add = DAG.getNode(ISD::ADD, SL, MVT::i32, OprLo, Bits32);
+ //// ctlz(x) = hi_32(x) == 0 ? ctlz(lo_32(x)) + 32 : ctlz(hi_32(x)
+ NewOpr = DAG.getNode(ISD::SELECT, SL, MVT::i32, Hi0orLo0, Add, OprHi);
----------------
Double // and missing closing )
================
Comment at: lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2077
+ Add = DAG.getNode(ISD::ADD, SL, MVT::i32, OprHi, Bits32);
+ //// cttz(x) = lo_32(x) == 0 ? cttz(hi_32(x)) + 32 : cttz(lo_32(x))
+ NewOpr = DAG.getNode(ISD::SELECT, SL, MVT::i32, Hi0orLo0, Add, OprLo);
----------------
Double //
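For reference, the two identities in the quoted comments can be checked on the host with plain integers. This is only a sketch, not the DAG lowering from the patch: the names ctlz32, cttz32, ctlz64ViaHalves and cttz64ViaHalves are hypothetical, the 32-bit helpers are simple loops standing in for the hardware ops, and a zero input is treated as out of scope, as with the *_ZERO_UNDEF nodes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 32-bit count-leading-zeros op.
// A zero input is undefined in this sketch, so it is rejected.
static uint32_t ctlz32(uint32_t X) {
  assert(X != 0 && "zero input is undefined in this sketch");
  uint32_t N = 0;
  while (!(X & 0x80000000u)) {
    X <<= 1;
    ++N;
  }
  return N;
}

// Hypothetical stand-in for a 32-bit count-trailing-zeros op.
static uint32_t cttz32(uint32_t X) {
  assert(X != 0 && "zero input is undefined in this sketch");
  uint32_t N = 0;
  while (!(X & 1u)) {
    X >>= 1;
    ++N;
  }
  return N;
}

// ctlz(x) = hi_32(x) == 0 ? ctlz(lo_32(x)) + 32 : ctlz(hi_32(x))
static uint32_t ctlz64ViaHalves(uint64_t X) {
  uint32_t Lo = static_cast<uint32_t>(X);
  uint32_t Hi = static_cast<uint32_t>(X >> 32);
  return Hi == 0 ? ctlz32(Lo) + 32 : ctlz32(Hi);
}

// cttz(x) = lo_32(x) == 0 ? cttz(hi_32(x)) + 32 : cttz(lo_32(x))
static uint32_t cttz64ViaHalves(uint64_t X) {
  uint32_t Lo = static_cast<uint32_t>(X);
  uint32_t Hi = static_cast<uint32_t>(X >> 32);
  return Lo == 0 ? cttz32(Hi) + 32 : cttz32(Lo);
}
```

The ternary here only evaluates the half that is known to be nonzero, whereas the quoted DAG code builds both operands and picks one with ISD::SELECT, so the sketch says nothing about how the undefined half behaves in the real lowering.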
================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:1
-; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=FUNC %s
-; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=FUNC %s
+; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=SI-NOSDWA -check-prefix=FUNC %s
+; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=SI-SDWA -check-prefix=FUNC %s
----------------
Add -enable-var-scope to all of the FileCheck lines. Several of these tests are broken.
================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:171-172
+; FUNC-LABEL: {{^}}v_cttz_zero_undef_i64_with_select:
+; SI: v_ffbl_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}
+; SI: v_ffbl_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}
+; EG: MEM_RAT_CACHELESS STORE_RAW [[RESULT:T[0-9]+\.[XYZW]]]
----------------
This isn't checking the outputs or the select
================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:184
+; FUNC-LABEL: {{^}}v_cttz_i32_sel_eq_neg1:
+; SI: v_ffbl_b32_e32 [[RESULT:v[0-9]+]], [[VAL]]
+; SI: s_endpgm
----------------
Using undefined VAL
================
Comment at: test/CodeGen/AMDGPU/cttz_zero_undef.ll:198
+; FUNC-LABEL: {{^}}v_cttz_i32_sel_ne_neg1:
+; SI: v_ffbl_b32_e32 [[RESULT:v[0-9]+]], [[VAL]]
+; SI: s_endpgm
----------------
Undefined VAL
Repository:
rL LLVM
https://reviews.llvm.org/D37348