[PATCH] D116044: [AMDGPU] Enable divergence predicates for ctlz/cttz

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 20 08:10:50 PST 2021


alex-t created this revision.
alex-t added a reviewer: rampitec.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
alex-t requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

ctlz/cttz are lowered to a set of target opcodes.

  This change enables ISel to select the SALU or VALU form according to the SDNode divergence:
  CTLZ - S_FLBIT_I32_B32 if uniform and V_FFBH_U32_e64 if divergent
  CTTZ - S_FF1_I32_B32   if uniform and V_FFBL_B32_e64 if divergent
  Also, @llvm.amdgcn.sffbh.i32 is lowered to S_FLBIT_I32 if uniform and V_FFBH_I32_e64 if divergent.
  NOTE: The 64-bit versions S_FF1_I32_B64 and S_FLBIT_I32_B64 are not currently supported by the DAG ISel;
  ctlz/cttz with an i64 input are split into two 32-bit instructions. Nevertheless, the 64-bit instructions already
  have patterns, and these were equipped with the divergence predicates to make sure they are selected correctly when enabled.
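
  The mapping described above can be summarized as a small lookup. The sketch below is only an illustrative
  model in standalone C++, not code from the patch or from LLVM: the BitOp enum and the selectOpcode and
  checkUniformCtlz helpers are hypothetical names, and the opcode strings are copied from the description.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical model of the three operations named in the description.
enum class BitOp { Ctlz, Cttz, Sffbh };

// Returns the opcode name the description associates with each operation:
// the scalar (SALU) form for a uniform node, the vector (VALU) form for a
// divergent one. This is a plain table, not the actual ISel logic.
constexpr std::string_view selectOpcode(BitOp op, bool divergent) {
  switch (op) {
  case BitOp::Ctlz:
    return divergent ? "V_FFBH_U32_e64" : "S_FLBIT_I32_B32";
  case BitOp::Cttz:
    return divergent ? "V_FFBL_B32_e64" : "S_FF1_I32_B32";
  case BitOp::Sffbh:
    return divergent ? "V_FFBH_I32_e64" : "S_FLBIT_I32";
  }
  return "";
}

// Small self-check: a uniform ctlz maps to the scalar opcode.
inline void checkUniformCtlz() {
  assert(selectOpcode(BitOp::Ctlz, /*divergent=*/false) == "S_FLBIT_I32_B32");
}
```

  In the patch itself this choice is expressed through divergence predicates on the selection patterns in
  SOPInstructions.td rather than through a function like this.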


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D116044

Files:
  llvm/lib/Target/AMDGPU/SOPInstructions.td
  llvm/test/CodeGen/AMDGPU/divergence-driven-ctlz-cttz.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D116044.395457.patch
Type: text/x-patch
Size: 4138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211220/f8a3a491/attachment.bin>

