[PATCH] D73482: [AMDGPU] Fix lowering a16 image intrinsics

Tue Jan 28 06:41:12 PST 2020

rtaylor added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.a16.encode.ll:3
+; RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX9 %s
+
+define amdgpu_ps <4 x float> @load_3d(<8 x i32> inreg %rsrc, <4 x i32> inreg %smpl, float %sf, float %tf) {
----------------
sebastian-ne wrote:
> arsenm wrote:
> > if this is an encoding test, it should use -show-mc-encoding?
> Maybe the name was chosen uncarefully, this test should ensure that all components are actually converted and packed into one register.
> 
> Before this patch, a16 is not working because the y component just gets ignored, the compiler only ensures that the x component is present.
> Still, all other a16 tests are passing. So I tried to introduce this test as a regression test.
> It should make sure that actually all components are "encoded" into the packed coordinates.
> 
> I guess renaming it to llvm.amdgcn.image.a16.pack.ll would make more sense?
Should we be checking the encoding for all existing tests instead?

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll:5
 ; GCN: image_gather4 v[0:3], v0, s[0:7], s[8:11] dmask:0x1 a16{{$}}
-define amdgpu_ps <4 x float> @gather4_2d(<8 x i32> inreg %rsrc, <4 x i32> inreg %samp, half %s, half %t) {
+define amdgpu_ps <4 x float> @gather4_2d(<8 x i32> inreg %rsrc, <4 x i32> inreg %samp, <2 x half> %args) {
 main_body:
----------------
What is the point of combining the halves into vectors? I think this makes the test code unnecessarily complex and less readable.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73482/new/

https://reviews.llvm.org/D73482