[PATCH] D144198: [AMDGPU] Check exact width in get*ClassForBitWidth
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 16 08:51:58 PST 2023
foad added reviewers: AMDGPU, arsenm.
foad added a comment.
This is an RFC to check my intuition that we really want exact matches here - there should not be any cases where we return a wider class than was asked for. In particular, we have SGPR classes corresponding to every VGPR class width.
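To make the distinction concrete, here is a minimal sketch of the two lookup policies. It is not the code from the patch: `kClassWidths`, `lookupAtLeast` and `lookupExact` are hypothetical names, and the width table is an illustrative assumption, not the authoritative AMDGPU class list.

```cpp
#include <cassert>

// Hypothetical table of bit widths that have a register class.
// Illustrative assumption only; note that 448 is absent.
static const unsigned kClassWidths[] = {32,  64,  96,  128, 160, 192, 224,
                                        256, 288, 320, 352, 384, 512, 1024};

// Round-up policy: return the smallest class width that can hold BitWidth,
// or 0 if no class is wide enough.
unsigned lookupAtLeast(unsigned BitWidth) {
  for (unsigned W : kClassWidths)
    if (BitWidth <= W)
      return W;
  return 0;
}

// Exact policy: return BitWidth only if a class of exactly that width exists,
// otherwise 0 (standing in for a null register class).
unsigned lookupExact(unsigned BitWidth) {
  for (unsigned W : kClassWidths)
    if (BitWidth == W)
      return W;
  return 0;
}
```

Under these assumptions a 448-bit request silently gets a 512-bit class from `lookupAtLeast`, whereas `lookupExact` reports that no such class exists and leaves the caller to deal with it.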
Currently this causes a couple of test failures:
LLVM :: CodeGen/AMDGPU/GlobalISel/extractelement.ll
LLVM :: CodeGen/AMDGPU/GlobalISel/insertelement.ll
The failure looks like this:
LLVM ERROR: cannot select: %29:sreg_32(s32), %30:sreg_32(s32), %31:sreg_32(s32), %32:sreg_32(s32), %33:sreg_32(s32), %34:sreg_32(s32), %35:sreg_32(s32), %36:sreg_32(s32), %37:sreg_32(s32), %38:sreg_32(s32), %39:sreg_32(s32), %40:sreg_32(s32), %41:sreg_32(s32), %42:sreg_32(s32) = G_UNMERGE_VALUES %28:sgpr(<7 x s64>) (in function: dyn_insertelement_v7f64_s_s_s)
I guess this is a GlobalISel legalization problem. We don't have any 448-bit register classes corresponding to the `<7 x s64>` on the RHS of this instruction, so maybe legalization should have widened it to `<8 x s64>`?
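The arithmetic behind the 448-bit figure, and the effect of the suggested widening, can be sketched as follows. The helpers `vectorBits` and `nextPow2Elts` are hypothetical names used only for illustration. Rounding the element count up to a power of two is one assumed way to get from 7 to 8 elements, not a statement of what the legalizer does.

```cpp
#include <cassert>

// Total size in bits of a vector with NumElts elements of EltBits each.
unsigned vectorBits(unsigned NumElts, unsigned EltBits) {
  return NumElts * EltBits;
}

// Round an element count up to the next power of two (7 -> 8).
// Assumed widening rule, for illustration only.
unsigned nextPow2Elts(unsigned NumElts) {
  unsigned N = 1;
  while (N < NumElts)
    N <<= 1;
  return N;
}
```

With these helpers, `vectorBits(7, 64)` is 448, which matches the fourteen `s32` results of the `G_UNMERGE_VALUES` (14 * 32 = 448), and `vectorBits(nextPow2Elts(7), 64)` is 512.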
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144198/new/
https://reviews.llvm.org/D144198