[PATCH] D73482: [AMDGPU] Fix lowering a16 image intrinsics
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 28 00:06:44 PST 2020
sebastian-ne added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll:21
; GCN: image_gather4 v[0:3], v[0:1], s[0:7], s[8:11] dmask:0x1 a16 da{{$}}
-define amdgpu_ps <4 x float> @gather4_2darray(<8 x i32> inreg %rsrc, <4 x i32> inreg %samp, half %s, half %t, half %slice) {
main_body:
----------------
sebastian-ne wrote:
> arsenm wrote:
> > These test changes seem independent, but I think having the separate scalars has more value for testing that the packing code actually works
> Well, before this patch the packing did not work but the tests passed ;)
> To test the packing, I added the new test.
>
> nhaehnle said the amdgpu_ps calling convention is not built to handle f16 arguments, so it is not clear whether they are packed or not.
> The llvm.amdgcn.image.a16.dim.ll test, which takes i16 instead of half values, also uses packed arguments. It groups all arguments into <2 x i16>s.
I forgot to add: some test cases further down used e.g. v[2:9] as arguments. Using the packed arguments ensures that v[0:…] is always used, so the test should not be influenced by other optimizations or changes in the compiler.
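For readers unfamiliar with the packing being discussed, here is a rough sketch of the idea, not the actual lowering code. It assumes two 16-bit coordinates share one 32-bit register with the first coordinate in the low half, and that an odd trailing coordinate leaves the upper half as padding (zero here purely for illustration). The helper names are hypothetical. With three half coordinates (s, t, slice) this gives two dwords, which matches the v[0:1] address operand in the check line above.

```python
import struct


def half_bits(value: float) -> int:
    """Return the IEEE 754 binary16 bit pattern of `value` as an integer."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def pack_a16_coords(coords):
    """Pack 16-bit coordinates two per 32-bit dword, low half first.

    Assumption for illustration: an odd trailing coordinate is padded
    with zero in the upper half; real hardware treats it as don't-care.
    """
    bits = [half_bits(c) for c in coords]
    if len(bits) % 2:
        bits.append(0)
    return [bits[i] | (bits[i + 1] << 16) for i in range(0, len(bits), 2)]


# s, t, slice for a 2darray gather4: three halves fit in two dwords.
dwords = pack_a16_coords([1.0, 2.0, 3.0])
print([hex(d) for d in dwords])
```
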
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73482/new/
https://reviews.llvm.org/D73482