[PATCH] D144198: [AMDGPU] Check exact width in get*ClassForBitWidth

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 16 08:51:58 PST 2023


foad added reviewers: AMDGPU, arsenm.
foad added a comment.

This is an RFC to check my intuition that we really want exact matches here: there should not be any cases where we return a wider class than was asked for. In particular, we have SGPR classes corresponding to every VGPR class width.
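
To make the distinction concrete, here is a minimal standalone sketch of the two lookup policies. It is not the code from the patch: the function names and the list of widths are hypothetical placeholders chosen for illustration, and the real backend defines more widths than these.

```cpp
#include <array>
#include <optional>

// Placeholder class widths in bits, sorted ascending. This list is an
// assumption for illustration, not the actual set of AMDGPU class widths.
constexpr std::array<unsigned, 6> kClassWidths = {32, 64, 128, 256, 512, 1024};

// "At least as wide" policy: return the width of the first class that can
// hold BitWidth bits, even if that class is wider than requested.
std::optional<unsigned> getClassWidthAtLeast(unsigned BitWidth) {
  for (unsigned W : kClassWidths)
    if (BitWidth <= W)
      return W;
  return std::nullopt;
}

// Exact policy: only succeed when a class of exactly BitWidth bits exists.
std::optional<unsigned> getClassWidthExact(unsigned BitWidth) {
  for (unsigned W : kClassWidths)
    if (BitWidth == W)
      return W;
  return std::nullopt;
}
```

With these placeholder widths, a 448-bit request is silently rounded up to 512 under the first policy and is rejected under the second, so a caller asking for an unsupported width finds out immediately.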

Currently this causes a couple of test failures:

  LLVM :: CodeGen/AMDGPU/GlobalISel/extractelement.ll
  LLVM :: CodeGen/AMDGPU/GlobalISel/insertelement.ll

The failures look like this:

  LLVM ERROR: cannot select: %29:sreg_32(s32), %30:sreg_32(s32), %31:sreg_32(s32), %32:sreg_32(s32), %33:sreg_32(s32), %34:sreg_32(s32), %35:sreg_32(s32), %36:sreg_32(s32), %37:sreg_32(s32), %38:sreg_32(s32), %39:sreg_32(s32), %40:sreg_32(s32), %41:sreg_32(s32), %42:sreg_32(s32) = G_UNMERGE_VALUES %28:sgpr(<7 x s64>) (in function: dyn_insertelement_v7f64_s_s_s)

I guess this is a GlobalISel legalization problem. We don't have any 448-bit register classes corresponding to the `<7 x s64>` on the RHS of this instruction, so maybe legalization should have widened it to `<8 x s64>`?
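
The arithmetic behind that suggestion can be sketched as follows. This is only an illustration of the widening idea, not GlobalISel legalizer code: `widenedElementCount` is a hypothetical helper, and the width list is a placeholder assumption rather than the real set of register class widths.

```cpp
#include <array>
#include <optional>

// Placeholder class widths in bits, sorted ascending (an assumption for
// illustration; the real backend defines more widths than these).
constexpr std::array<unsigned, 6> kAssumedClassWidths = {32,  64,  128,
                                                         256, 512, 1024};

// True if some class has exactly this many bits.
bool hasClassOfWidth(unsigned Bits) {
  for (unsigned W : kAssumedClassWidths)
    if (W == Bits)
      return true;
  return false;
}

// Hypothetical helper: smallest element count N >= NumElts such that a
// vector of N elements of EltBits bits each has an exactly matching class.
std::optional<unsigned> widenedElementCount(unsigned NumElts,
                                            unsigned EltBits) {
  if (EltBits == 0)
    return std::nullopt; // Avoid looping forever on a zero-width element.
  const unsigned MaxBits = kAssumedClassWidths.back();
  for (unsigned N = NumElts; N * EltBits <= MaxBits; ++N)
    if (hasClassOfWidth(N * EltBits))
      return N;
  return std::nullopt;
}
```

For `<7 x s64>` the total is 7 * 64 = 448 bits, which has no matching class, and the next element count that does is 8 (512 bits), i.e. `<8 x s64>`.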


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144198/new/

https://reviews.llvm.org/D144198


