[PATCH] D116284: [AMDGPU] Enable divergence-driven 'ctpop' selection
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 31 03:40:35 PST 2021
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1028
+ (i32 (V_BCNT_U32_B32_e64 (i32 (EXTRACT_SUBREG i64:$src, sub0)), (i32 0)))), sub0,
+ (i32 (COPY_TO_REGCLASS (i32 (V_MOV_B32_e32 (i32 0))), VGPR_32)), sub1)
+>;
----------------
Do you really need COPY_TO_REGCLASS here?
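
For context, here is a minimal sketch of what the quoted output pattern appears to compute. It assumes V_BCNT_U32_B32 returns popcount(src0) + src1, and that the outer instruction, which the quoted hunk cuts off, is a second V_BCNT_U32_B32 applied to sub1. The Python names below are hypothetical and are not taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def v_bcnt_u32_b32(src0: int, src1: int) -> int:
    """Model of V_BCNT_U32_B32 (assumed semantics): popcount(src0) + src1."""
    return (bin(src0 & MASK32).count("1") + (src1 & MASK32)) & MASK32


def ctpop_i64(src: int) -> int:
    """Model of the divergent i64 ctpop lowering as two chained bit counts."""
    lo = src & MASK32           # EXTRACT_SUBREG i64:$src, sub0
    hi = (src >> 32) & MASK32   # EXTRACT_SUBREG i64:$src, sub1 (assumed, not shown in the hunk)
    count = v_bcnt_u32_b32(hi, v_bcnt_u32_b32(lo, 0))
    high_half = 0               # the V_MOV_B32 0 that fills sub1 of the result
    return (high_half << 32) | count  # REG_SEQUENCE: count in sub0, zero in sub1
```

The count never exceeds 64, so the high half of the result is always zero. That is why the pattern materializes a constant 0 for sub1.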
================
Comment at: llvm/lib/Target/AMDGPU/SOPInstructions.td:1374
def : GCNPat <
- (i64 (ctpop i64:$src)),
+ (i64 (UniformUnaryFrag<ctpop> i64:$src)),
(i64 (REG_SEQUENCE SReg_64,
----------------
Do we really need both this pattern and the one on line 252? Surely one of them is redundant?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116284/new/
https://reviews.llvm.org/D116284