[all-commits] [llvm/llvm-project] 8e6820: AMDGPU/GlobalISel: Explicitly track d16 for image ...

Mon Jan 10 11:25:26 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 8e682086a067b6cca3034ec5b64ead4b49294685
      https://github.com/llvm/llvm-project/commit/8e682086a067b6cca3034ec5b64ead4b49294685
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2022-01-10 (Mon, 10 Jan 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
    M llvm/lib/Target/AMDGPU/SIInstructions.td
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.d16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.store.2d.ll

  Log Message:
  -----------
  AMDGPU/GlobalISel: Explicitly track d16 for image legalization

We were trying to guess the original IR type of image intrinsics
after legalization to determine whether they were d16, but this did
not work. Explicitly track whether the operation is d16 in the
opcode, as is done for the buffer intrinsics.

The OpenCL library uses f32 image writes with a dmask of 15 for
some reason, and the type guess was incorrectly switching them to
d16. Fixes image failures in the OpenCL conformance test. The
equivalent dmask for loads doesn't even select in either selector.
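
As a rough illustration of the approach described in the log message, the sketch below records d16 in the opcode at the point where the original IR element type is still known, so later stages read the opcode instead of re-deriving d16 from register types. All names here (ElemType, ImageOpcode, ImageStore, legalizeImageStore, isD16) are hypothetical stand-ins, not the identifiers used in the AMDGPU backend, and the model ignores everything except the element type and dmask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; not LLVM's real types or opcodes.
enum class ElemType { F16, F32 };
enum class ImageOpcode { Store, StoreD16 };

struct ImageStore {
  ImageOpcode Opcode; // d16-ness is recorded here, once, during legalization
  uint32_t DMask;     // bitmask of the components being written
};

// Pick the opcode while the original IR element type is still available.
// The dmask is carried through unchanged and plays no part in the decision.
inline ImageStore legalizeImageStore(ElemType OrigElem, uint32_t DMask) {
  ImageOpcode Opc =
      OrigElem == ElemType::F16 ? ImageOpcode::StoreD16 : ImageOpcode::Store;
  return ImageStore{Opc, DMask};
}

// Later stages query the opcode rather than guessing from legalized types.
inline bool isD16(const ImageStore &Op) {
  return Op.Opcode == ImageOpcode::StoreD16;
}
```

Under this model, an f32 store with a dmask of 15 stays a non-d16 operation regardless of what its operands look like after legalization, which is the case the log message says was being misclassified.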