[PATCH] D116044: [AMDGPU] Enable divergence predicates for ctlz/cttz
Alexander via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 20 08:10:50 PST 2021
alex-t created this revision.
alex-t added a reviewer: rampitec.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
alex-t requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.
ctlz/cttz are lowered to a set of target opcodes.
This change enables the ISel to select the SALU or VALU form according to the SDNode divergence.
CTLZ - S_FLBIT_I32_B32 if uniform and V_FFBH_U32_e64 if divergent
CTTZ - S_FF1_I32_B32 if uniform and V_FFBL_B32_e64 if divergent
Also @llvm.amdgcn.sffbh.i32 gets lowered to S_FLBIT_I32 if uniform and V_FFBH_I32_e64 if divergent
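The mapping above can be summarised as a lookup keyed on the operation and its divergence. The following is only a toy model in Python, not LLVM code: the table SELECTION and the function select_opcode are hypothetical names introduced for illustration, and the opcode strings are copied from the description above.

```python
# Toy model of the selection rule described in this revision.
# SELECTION and select_opcode are hypothetical names; this is not LLVM code.
SELECTION = {
    # (operation, is_divergent): selected opcode
    ("ctlz", False): "S_FLBIT_I32_B32",
    ("ctlz", True): "V_FFBH_U32_e64",
    ("cttz", False): "S_FF1_I32_B32",
    ("cttz", True): "V_FFBL_B32_e64",
    ("llvm.amdgcn.sffbh.i32", False): "S_FLBIT_I32",
    ("llvm.amdgcn.sffbh.i32", True): "V_FFBH_I32_e64",
}


def select_opcode(operation: str, divergent: bool) -> str:
    """Return the SALU opcode for a uniform node, the VALU opcode otherwise."""
    return SELECTION[(operation, divergent)]


if __name__ == "__main__":
    for (operation, divergent), opcode in SELECTION.items():
        kind = "divergent" if divergent else "uniform"
        print(f"{operation:24s} {kind:10s} -> {opcode}")
```

In this model a uniform node always maps to an S_* (scalar) opcode and a divergent node to a V_* (vector) opcode, which is the property the divergence predicates are meant to enforce.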
NOTE: The 64-bit versions S_FF1_I32_B64 and S_FLBIT_I32_B64 are not currently supported by the DAG ISel;
ctlz/cttz with i64 input are split into two 32-bit instructions. Nevertheless, the 64-bit instructions already have patterns,
and these were equipped with the divergence predicates to make sure they will be selected correctly once enabled.
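For readers unfamiliar with the i64 split, the general idea of building a 64-bit count from two 32-bit counts can be sketched as follows. This is a Python illustration of the arithmetic only, under the assumption that a zero input yields the bit width (a convention chosen for this sketch); it does not reproduce the exact instruction sequence the backend emits, and all function names are hypothetical.

```python
# Sketch: 64-bit ctlz/cttz composed from 32-bit counts on each half.
# Assumption for this sketch: a zero input returns the bit width.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def ctlz32(x: int) -> int:
    """Count leading zeros of a 32-bit value."""
    return 32 - (x & MASK32).bit_length()


def cttz32(x: int) -> int:
    """Count trailing zeros of a 32-bit value."""
    x &= MASK32
    if x == 0:
        return 32
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def ctlz64_split(x: int) -> int:
    """Count leading zeros of a 64-bit value using two 32-bit counts."""
    x &= MASK64
    hi, lo = x >> 32, x & MASK32
    # If the high half has a set bit, it alone decides the result.
    return ctlz32(hi) if hi != 0 else 32 + ctlz32(lo)


def cttz64_split(x: int) -> int:
    """Count trailing zeros of a 64-bit value using two 32-bit counts."""
    x &= MASK64
    hi, lo = x >> 32, x & MASK32
    # If the low half has a set bit, it alone decides the result.
    return cttz32(lo) if lo != 0 else 32 + cttz32(hi)
```

Because each half is handled by a 32-bit operation, the 32-bit patterns described above are the ones actually exercised for i64 inputs.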
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D116044
Files:
llvm/lib/Target/AMDGPU/SOPInstructions.td
llvm/test/CodeGen/AMDGPU/divergence-driven-ctlz-cttz.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D116044.395457.patch
Type: text/x-patch
Size: 4138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211220/f8a3a491/attachment.bin>