[PATCH] D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 6 05:15:33 PST 2022


alex-t marked an inline comment as done.
alex-t added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:1374
 def : GCNPat <
-  (i64 (ctpop i64:$src)),
+  (i64 (UniformUnaryFrag<ctpop> i64:$src)),
     (i64 (REG_SEQUENCE SReg_64,
----------------
foad wrote:
> Do we really need both this pattern and the one on line 252? Surely one of them is redundant?
These two are not exactly identical.
The first one, at line 252, accepts i64 and returns i32.
The second one accepts i64 and returns i64.
Without the latter, no implicit zero extension of the result to i64 occurs.
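
To illustrate the difference, here is a rough C++ sketch of the two result types. It is not the actual TableGen; the function names are hypothetical, and the assumption that the i64 form pairs the 32-bit count with a zero high half is inferred from the REG_SEQUENCE in the hunk above.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical model of the first pattern: i64 in, i32 out.
std::uint32_t ctpop_i64_to_i32(std::uint64_t src) {
    // std::bitset<64>::count() returns the number of set bits (0..64).
    return static_cast<std::uint32_t>(std::bitset<64>(src).count());
}

// Hypothetical model of the second pattern: i64 in, i64 out.
// The 32-bit count becomes the low half and the high half is an
// explicit zero (assumed), which is the zero extension in question.
std::uint64_t ctpop_i64_to_i64(std::uint64_t src) {
    std::uint32_t lo = ctpop_i64_to_i32(src);
    std::uint32_t hi = 0;
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

Both functions compute the same count; only the width of the result differs, which is why a selector that needs an i64 result cannot use the first form without a separate extend.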


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116284/new/

https://reviews.llvm.org/D116284
