[PATCH] D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 6 05:49:39 PST 2022
foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.
LGTM.
================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:1374
def : GCNPat <
- (i64 (ctpop i64:$src)),
+ (i64 (UniformUnaryFrag<ctpop> i64:$src)),
(i64 (REG_SEQUENCE SReg_64,
----------------
alex-t wrote:
> foad wrote:
> > Do we really need both this pattern and the one on line 252? Surely one of them is redundant?
> These two are not exactly identical.
> The first one, at line 252, accepts i64 and returns i32.
> The second one accepts i64 and returns i64.
> Without the latter one, no implicit zero-extension occurs.
Actually, I see now: the i64 to i32 pattern is used only for GlobalISel, and the i64 to i64 pattern is used only for SelectionDAG.
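
For readers following the thread, the difference between the two result shapes can be sketched in plain C++. This is only an illustration, not the TableGen from the patch: the function names `popcount64To32` and `popcount64To64` are hypothetical, and `std::bitset` stands in for whatever scalar bit-count instruction the patterns actually select. It assumes the i64 result is formed by pairing the 32-bit count with an explicit zero high half, which is what the REG_SEQUENCE in the quoted pattern suggests.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the "i64 to i32" shape: a 64-bit source,
// a 32-bit population count as the result.
static uint32_t popcount64To32(uint64_t Src) {
  return static_cast<uint32_t>(std::bitset<64>(Src).count());
}

// Hypothetical stand-in for the "i64 to i64" shape: the same 32-bit count
// goes in the low half, and the high half is an explicit zero. This is the
// zero-extension that nothing else would perform implicitly.
static uint64_t popcount64To64(uint64_t Src) {
  const uint32_t Lo = popcount64To32(Src);
  const uint32_t Hi = 0;
  assert(Lo <= 64 && "a 64-bit value has at most 64 set bits");
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```

Since the count never exceeds 64, the high half is always zero, so the two shapes differ only in result type and not in value.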
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116284/new/
https://reviews.llvm.org/D116284