[all-commits] [llvm/llvm-project] c3a741: AMDGPU/GlobalISel: Fix legalization failure for s6...

Matt Arsenault via All-commits all-commits at lists.llvm.org
Mon Jan 17 07:04:54 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: c3a74183a52f5cecc19f947187921866665ef279
      https://github.com/llvm/llvm-project/commit/c3a74183a52f5cecc19f947187921866665ef279
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2022-01-17 (Mon, 17 Jan 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll

  Log Message:
  -----------
  AMDGPU/GlobalISel: Fix legalization failure for s65 shifts

This was trying to clamp s65 down to s32, which wasn't handled, so we
need to promote all the way to s128 first. Having to order the
legalization rules in just the right way is rather unsatisfying, but
I'm not sure how smart the legalizer should be in trying to interpret
the rules.
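
To illustrate why rule order matters here, the following is a toy model
only. It does not use the LLVM LegalizerInfo API, and every name in it
(Rule, widenToNextPow2, clampTo32Or64, legalizeWidth) is hypothetical.
It assumes, for the sake of the sketch, that narrowing is only
implemented for power-of-two widths and that the first matching rule
always wins.

```cpp
#include <functional>
#include <optional>
#include <vector>

// A rule inspects a scalar bit width and either proposes a new width or
// declines by returning std::nullopt. Hypothetical, not an LLVM type.
using Rule = std::function<std::optional<unsigned>(unsigned)>;

static bool isPow2(unsigned W) { return W != 0 && (W & (W - 1)) == 0; }

// Round a non-power-of-two width up to the next power of two (65 -> 128).
static std::optional<unsigned> widenToNextPow2(unsigned W) {
  if (isPow2(W))
    return std::nullopt;
  unsigned P = 1;
  while (P < W)
    P <<= 1;
  return P;
}

// Clamp a width into the range [32, 64].
static std::optional<unsigned> clampTo32Or64(unsigned W) {
  if (W < 32)
    return 32u;
  if (W > 64)
    return 64u;
  return std::nullopt;
}

// Repeatedly apply the first rule that fires. Returns the final width,
// or 0 if a rule asks for a narrowing this toy model cannot perform
// (narrowing from a non-power-of-two width) or if no fixed point is
// reached within a small step limit.
static unsigned legalizeWidth(unsigned W, const std::vector<Rule> &Rules) {
  for (int Step = 0; Step < 16; ++Step) {
    std::optional<unsigned> Next;
    for (const Rule &R : Rules) {
      Next = R(W);
      if (Next)
        break;
    }
    if (!Next)
      return W; // No rule fired, so W is treated as legal.
    if (*Next < W && !isPow2(W))
      return 0; // Unhandled narrowing from an odd width.
    W = *Next;
  }
  return 0;
}
```

In this model, legalizeWidth(65, {clampTo32Or64, widenToNextPow2})
fails because the clamp fires first on the odd width, while
legalizeWidth(65, {widenToNextPow2, clampTo32Or64}) first promotes to
128 and then clamps to 64.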


  Commit: 0b1140e883527101599e09462d8aa353cdec1903
      https://github.com/llvm/llvm-project/commit/0b1140e883527101599e09462d8aa353cdec1903
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2022-01-17 (Mon, 17 Jan 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/GCNSubtarget.h

  Log Message:
  -----------
  AMDGPU: Correct getMaxNumSGPR treatment of flat_scratch

This was approximating the entry point logic for flat_scratch_init,
which is not really the point. We need to account for whether we need
to reserve the SGPR pair used for flat_scratch, not whether we need
the initialization kernel argument. For an arbitrary function, the old
logic would over-report the number of potentially free SGPRs. The
logic for architected flat scratch also only applies to the
initialization in the kernel, not the reserved registers at the end.

Avoids compile failures in a future patch from allocating more SGPRs
than the subtarget supports.
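
The distinction can be sketched with a toy calculation. This is not the
code in AMDGPUSubtarget.cpp; the struct, the function names, and the
budget of 100 registers used in the note below are all hypothetical.
Only the idea that flat_scratch occupies an SGPR pair comes from the
message above.

```cpp
// Hypothetical per-function facts, not LLVM data structures.
struct FunctionInfo {
  bool HasFlatScratchInitArg; // kernel receives the init argument
  bool ReservesFlatScratch;   // function must keep the SGPR pair reserved
};

// flat_scratch occupies a pair of SGPRs.
constexpr unsigned FlatScratchPairSize = 2;

// Old-style approximation: keyed on the kernel initialization argument.
unsigned maxSGPRsByInitArg(unsigned Budget, const FunctionInfo &F) {
  return Budget - (F.HasFlatScratchInitArg ? FlatScratchPairSize : 0);
}

// Corrected idea: keyed on whether the pair must actually be reserved.
unsigned maxSGPRsByReservation(unsigned Budget, const FunctionInfo &F) {
  return Budget - (F.ReservesFlatScratch ? FlatScratchPairSize : 0);
}
```

For a non-kernel function that reserves the pair but has no
initialization argument, FunctionInfo{false, true}, and an illustrative
budget of 100, the first function reports 100 free SGPRs and the second
reports 98, which is the over-reporting described above.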


Compare: https://github.com/llvm/llvm-project/compare/95bf5ac8a827...0b1140e88352

