[PATCH] D73482: [AMDGPU] Fix lowering a16 image intrinsics

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 28 00:06:44 PST 2020


sebastian-ne added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll:21
 ; GCN: image_gather4 v[0:3], v[0:1], s[0:7], s[8:11] dmask:0x1 a16 da{{$}}
-define amdgpu_ps <4 x float> @gather4_2darray(<8 x i32> inreg %rsrc, <4 x i32> inreg %samp, half %s, half %t, half %slice) {
 main_body:
----------------
sebastian-ne wrote:
> arsenm wrote:
> > These test changes seem independent, but I think having the separate scalars has more value for testing that the packing code actually works
> Well, before this patch the packing did not work but the tests passed ;)
> To test the packing, I added the new test.
> 
> nhaehnle said the amdgpu_ps calling convention is not built to handle f16 arguments, so it is not clear whether they are packed or not.
> The llvm.amdgcn.image.a16.dim.ll test, which takes i16 instead of halves, also uses packed arguments. It groups all arguments into <2 x i16>s.
I forgot to add: some test cases further down used e.g. v[2:9] as arguments. Using the packed arguments ensures that v[0:…] is always used, so the test should not be influenced by other optimizations or changes in the compiler.
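
To illustrate what the packing is expected to do, here is a small Python model (not the compiler's code). 16-bit coordinates go two per 32-bit dword, low half first, which is why the three half coordinates in the check line above occupy v[0:1]. The helper name pack_a16 and the zero fill for the unused high half are assumptions made for this sketch.

```python
import struct


def pack_a16(coords):
    """Pack half-precision coordinates two per 32-bit dword, low half first.

    Hypothetical helper for illustration only; it models the register
    layout, not the actual lowering code in the AMDGPU backend.
    """
    dwords = []
    for i in range(0, len(coords), 2):
        # Reinterpret the IEEE half value as its raw 16-bit pattern.
        lo = struct.unpack("<H", struct.pack("<e", coords[i]))[0]
        if i + 1 < len(coords):
            hi = struct.unpack("<H", struct.pack("<e", coords[i + 1]))[0]
        else:
            # Assumption: an unused high half is modelled as zero here.
            hi = 0
        dwords.append((hi << 16) | lo)
    return dwords


# s, t, slice for a 2darray gather: three halves fit in two dwords.
packed = pack_a16([1.0, 2.0, 3.0])
print([hex(d) for d in packed])
```

With separate half arguments, where each coordinate starts out depends on the calling convention, which is the uncertainty mentioned above.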


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73482/new/

https://reviews.llvm.org/D73482




