[PATCH] D74314: AMDGPU/GlobalISel: Fix asserting on gather4 intrinsics

Sun Feb 16 10:00:48 PST 2020

nhaehnle added a comment.

LGTM, except for one thing mentioned in the inline comment (which other code may still be getting wrong as well, but oh well...).

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll:499
+main_body:
+  %v = call <4 x float> @llvm.amdgcn.image.gather4.b.2d.v4f32.f32.f16(i32 1, float %bias, half %s, half %t, <8 x i32> %rsrc, <4 x i32> %samp, i1 false, i32 0, i32 0)
+  ret <4 x float> %v
----------------
Can you please change this test to make %bias a half as well?

We recently double-checked the docs, and the rule is: when A16 is set in the encoding, the %bias address still occupies a full 32-bit register, but only the lower 16 bits are meaningful, and they're interpreted as a half-float.

(The same is *not* true of the zcompare or offset address operands that the _c and _o variants have.)
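
For illustration, the requested test change might look roughly like the sketch below. The function name @gather4_b_2d_half_bias is hypothetical, and the .f16.f16 suffix on the intrinsic is an assumption (that the bias type is mangled separately from the coordinate type), so it should be checked against the intrinsic definitions before use.

```llvm
; Sketch only: the function name is made up, and the .f16.f16 suffix assumes
; the bias type is mangled separately from the coordinate type.
define amdgpu_ps <4 x float> @gather4_b_2d_half_bias(<8 x i32> inreg %rsrc, <4 x i32> inreg %samp, half %bias, half %s, half %t) {
main_body:
  ; %bias is passed as a half here, matching the A16 rule for the bias address.
  %v = call <4 x float> @llvm.amdgcn.image.gather4.b.2d.v4f32.f16.f16(i32 1, half %bias, half %s, half %t, <8 x i32> %rsrc, <4 x i32> %samp, i1 false, i32 0, i32 0)
  ret <4 x float> %v
}

declare <4 x float> @llvm.amdgcn.image.gather4.b.2d.v4f32.f16.f16(i32, half, half, half, <8 x i32>, <4 x i32>, i1, i32, i32)
```

Per the rule above, a corresponding _c or _o test would leave the zcompare or offset operand at its existing 32-bit type and only narrow %bias and the coordinates.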

https://reviews.llvm.org/D74314