[PATCH] D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection
Alexander via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 6 05:15:33 PST 2022
alex-t marked an inline comment as done.
alex-t added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:1374
def : GCNPat <
- (i64 (ctpop i64:$src)),
+ (i64 (UniformUnaryFrag<ctpop> i64:$src)),
(i64 (REG_SEQUENCE SReg_64,
----------------
foad wrote:
> Do we really need both this pattern and the one on line 252? Surely one of them is redundant?
These two patterns are not identical.
The first one, at line 252, accepts i64 and returns i32.
The second one accepts i64 and returns i64.
Without the latter, nothing implicitly zero-extends the i32 result to i64.
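
To illustrate the difference in result types, here is a minimal C++ sketch of the semantics only. It is not code from the patch. The function names are hypothetical, and it assumes that the truncated REG_SEQUENCE above pairs the 32-bit count with a zero high half.

```cpp
#include <cstdint>

// Hypothetical model of the first pattern: i64 in, i32 out.
inline uint32_t ctpop_i64_to_i32(uint64_t src) {
    uint32_t count = 0;
    while (src != 0) {
        src &= src - 1;  // clear the lowest set bit
        ++count;
    }
    return count;
}

// Hypothetical model of the second pattern: i64 in, i64 out.
// Assumption: the 64-bit result is assembled from two 32-bit halves,
// the count in the low half and an explicit zero in the high half.
inline uint64_t ctpop_i64_to_i64(uint64_t src) {
    uint64_t lo = ctpop_i64_to_i32(src);
    uint64_t hi = 0;  // the explicit zero extension
    return (hi << 32) | lo;
}
```

In this model the explicit zero high half is what makes the result an i64. With only the first function available, a caller that needs an i64 would have to widen the value itself.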
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116284/new/
https://reviews.llvm.org/D116284