[PATCH] D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection

Thu Jan 6 05:15:33 PST 2022

alex-t marked an inline comment as done.
alex-t added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:1374
 def : GCNPat <
-  (i64 (ctpop i64:$src)),
+  (i64 (UniformUnaryFrag<ctpop> i64:$src)),
     (i64 (REG_SEQUENCE SReg_64,
----------------
foad wrote:
> Do we really need both this pattern and the one on line 252? Surely one of them is redundant?
These two are not identical.
The first one, at line 252, accepts i64 and returns i32.
The second one accepts i64 and returns i64.
Without the latter, the result is not implicitly zero-extended to i64.
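
To illustrate the difference in result types, here is a rough C++ model of the two forms. It is only a sketch, not the TableGen patterns themselves: the function names are hypothetical, and filling the high half with zero is an assumption based on the zero extension mentioned above.

```cpp
#include <cstdint>

// Hypothetical model of the i64 -> i32 form: count the set bits of a
// 64-bit input. The count is at most 64, so it fits in 32 bits.
uint32_t ctpop_i64_to_i32(uint64_t src) {
    uint32_t count = 0;
    while (src != 0) {
        src &= src - 1;  // clear the lowest set bit
        ++count;
    }
    return count;
}

// Hypothetical model of the i64 -> i64 form: the low 32 bits hold the
// count and the high 32 bits are explicitly zero (assumed here), which
// is what a zero extension of the 32-bit count looks like.
uint64_t ctpop_i64_to_i64(uint64_t src) {
    uint32_t lo = ctpop_i64_to_i32(src);
    uint32_t hi = 0;
    return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

In this model the i64 form differs from the i32 form only in that something has to produce the zero high half; with the i32 form alone, nothing does.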

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116284/new/

https://reviews.llvm.org/D116284