[all-commits] [llvm/llvm-project] 40bc91: GlobalISel: Relax handling of G_ASSERT_* with sour...

Fri Apr 22 07:50:08 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 40bc9112c079cfdbaa051e55833ec24f64d981a4
      https://github.com/llvm/llvm-project/commit/40bc9112c079cfdbaa051e55833ec24f64d981a4
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2022-04-22 (Fri, 22 Apr 2022)

  Changed paths:
    M llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
    M llvm/lib/CodeGen/MachineVerifier.cpp
    A llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-assert-zext.mir
    M llvm/test/MachineVerifier/test_g_assert_sext_register_bank_class.mir
    M llvm/test/MachineVerifier/test_g_assert_zext_register_bank_class.mir

  Log Message:
  -----------
  GlobalISel: Relax handling of G_ASSERT_* with source register classes

The most common situation where G_ASSERT_ZEXT appears for AMDGPU is a
copy from a physical register, which happens to set the actual
register class on the virtual register. After copy coalescing, the
assert's source operand had a vreg with a set class. The verifier was
strictly rejecting cases where the set class/bank weren't an exact
match. Additionally, RegBankSelect was expecting a register bank
to be set on the register, not a class.

This is much stricter than regular copies, so relax this behavior. This
now allows these 2 cases:

1. Source register has either a class or a bank, and the result does not.
2. Source register has a register class, and the result is a register
   with a matching bank.

This should remove the need for some kind of special handling to avoid
violating this constraint when folding copies.
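
As a rough illustration of the two relaxed cases listed above, here is a
minimal, self-contained C++ sketch. It is not the MachineVerifier code: the
VRegConstraint struct, the bankForClass mapping, and the class and bank name
strings are hypothetical stand-ins chosen for this example, and the sketch
assumes that an exact class or bank match continues to be accepted as before.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified model of a virtual register's constraint.
// At most one of RegClass / RegBank is expected to be set.
struct VRegConstraint {
  std::optional<std::string> RegClass;
  std::optional<std::string> RegBank;
};

// Hypothetical class-to-bank mapping, for illustration only. A real target
// would derive this from its register bank information.
std::optional<std::string> bankForClass(const std::string &RC) {
  if (RC == "VGPR_32")
    return std::string("VGPR");
  if (RC == "SGPR_32")
    return std::string("SGPR");
  return std::nullopt;
}

// Returns true if a G_ASSERT_*-style source/result pair would be accepted
// under the relaxed rules described in the log message (as modeled here).
bool assertOperandsAccepted(const VRegConstraint &Src,
                            const VRegConstraint &Dst) {
  assert(!(Src.RegClass && Src.RegBank) && "model allows only one to be set");
  assert(!(Dst.RegClass && Dst.RegBank) && "model allows only one to be set");

  // Case 1: the result has neither a class nor a bank. The source may carry
  // a class, a bank, or nothing.
  if (!Dst.RegClass && !Dst.RegBank)
    return true;

  // Result has a class: assumed to still require an exact class match.
  if (Dst.RegClass)
    return Src.RegClass == Dst.RegClass;

  // Result has a bank: accept an exact bank match ...
  if (Src.RegBank)
    return Src.RegBank == Dst.RegBank;

  // ... or Case 2: the source has a class whose bank matches the result.
  if (Src.RegClass)
    return bankForClass(*Src.RegClass) == Dst.RegBank;

  // Only the result is constrained: not one of the allowed cases.
  return false;
}
```

In this model, a source with class "VGPR_32" is accepted against an
unconstrained result or a result on bank "VGPR", and rejected against a
result on bank "SGPR".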

  Commit: 794a0bb547484ec33c13bd6c7c04b1dbd03d040a
      https://github.com/llvm/llvm-project/commit/794a0bb547484ec33c13bd6c7c04b1dbd03d040a
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2022-04-22 (Fri, 22 Apr 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
    M llvm/lib/Target/AMDGPU/AMDGPULowerIntrinsics.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.h
    A llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-amdgcn.workitem.id.mir
    M llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
    M llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
    M llvm/test/CodeGen/AMDGPU/memory_clause.ll
    M llvm/test/CodeGen/AMDGPU/zext-lid.ll

  Log Message:
  -----------
  AMDGPU: Directly implement computeKnownBits for workitem intrinsics

Currently, metadata is inserted in a late pass, and that metadata is
lowered to an AssertZext. The metadata would be more useful if it were
inserted earlier, after inlining but before codegen.

Probably shouldn't change anything now. Just replacing the
late metadata annotation needs more work, since we lose
out on optimizations after these are lowered to CopyFromReg.

Seems to be slightly better than relying on the AssertZext from the
metadata. The test change in cvt_f32_ubyte.ll is a quirk from it using
-start-before=amdgpu-isel instead of running the usual codegen
pipeline.
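
To illustrate the general idea behind a known-bits computation for a bounded
ID value, here is a minimal, self-contained C++ sketch. It does not use
LLVM's KnownBits API, and the helper names are hypothetical. The bound of
1023 in the usage note is an assumption made for the example, not a value
taken from this commit.

```cpp
#include <cstdint>

// Counts how many high bits of a 32-bit value must be zero when the value
// is known to be at most MaxValue (inclusive upper bound).
unsigned knownLeadingZeros(uint32_t MaxValue) {
  unsigned Count = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (MaxValue & Bit) == 0;
       Bit >>= 1)
    ++Count;
  return Count;
}

// Builds a mask with a 1 in every bit position known to be zero.
uint32_t knownZeroMask(uint32_t MaxValue) {
  unsigned LZ = knownLeadingZeros(MaxValue);
  if (LZ == 32)
    return 0xFFFFFFFFu; // Value can only be 0; avoid shifting by 32.
  return ~(0xFFFFFFFFu >> LZ);
}
```

For example, if an ID were assumed never to exceed 1023, the top 22 bits
would be known zero, so a zero-extension or mask of the low 10 bits could be
folded away without relying on a separate AssertZext.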

Compare: https://github.com/llvm/llvm-project/compare/369ef9bf6056...794a0bb54748