[llvm] [AMDGPU] Optimize image sample followed by llvm.amdgcn.cvt.pkrtz into d16 variant (PR #145203)

Russell Liu via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 3 09:29:05 PDT 2025


================
@@ -247,6 +247,42 @@ simplifyAMDGCNImageIntrinsic(const GCNSubtarget *ST,
                                        ArgTys[0] = User->getType();
                                      });
         }
+
+        // Fold image.sample + cvt.pkrtz -> extractelement idx0 into a single
+        // d16 image sample.
----------------
GinShio wrote:

I agree we should deal with this carefully.

> This is exactly the issue I mentioned. For fixed point formats, the data is not converted to fp32 and then truncated to fp16, it's directly rounded to fp16.

I'm not sure whether the failing case is `piglit/arb_texture_view-rendering-formats`. [_fs*\_float16_](https://gitlab.freedesktop.org/mesa/piglit/-/blob/main/tests/spec/arb_texture_view/rendering-formats.c?ref_type=heads#L287-L288) is similar to this pattern:
 + the texture format is 16f,
 + the sampled type is float,
 + the result is used by `PackHalf2x16`.

> But then if it's OK to use the normal round-to-nearest-even mode then why would you generate pkrtz instructions in the first place?

I guess it's generated for the SPIR-V `PackHalf2x16` instruction, to fix a VKD3D-Proton issue.

References:
 + https://github.com/KhronosGroup/Vulkan-Docs/issues/1825
 + https://github.com/GPUOpen-Drivers/llpc/blob/40cb8d95ad8d6f7f1652e3fd47d39667594cce08/llpc/translator/lib/SPIRV/SPIRVReader.cpp#L10609-L10614
 + Direct3D11 Functional Specification 3.2.1 Floating Point Conversion
   > Round-to-zero must be used during conversion to another float format.
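
To make the rounding concern concrete, here is a minimal sketch of how round-to-zero and round-to-nearest-even can yield different fp16 bit patterns for the same fp32 input. The helper `floatToHalfBits` is hypothetical (it is not LLVM or hardware code), and it assumes a finite input within the fp16 normal range; denormals, infinities, NaNs and overflow are deliberately not handled.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only: converts an fp32 value to fp16
// bits. Assumes the input is finite and lies in the fp16 normal range.
static uint16_t floatToHalfBits(float F, bool RoundToZero) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));

  uint32_t Sign = (Bits >> 16) & 0x8000u;
  // Rebias the exponent from fp32 (bias 127) to fp16 (bias 15).
  uint32_t Exp = ((Bits >> 23) & 0xFFu) - 127u + 15u;
  uint32_t Mant = Bits & 0x7FFFFFu;

  // Dropping the low 13 mantissa bits is round-to-zero on the magnitude.
  uint32_t Half = (Exp << 10) | (Mant >> 13);

  if (!RoundToZero) {
    // Round-to-nearest-even: round up when the dropped bits exceed half an
    // ulp, or equal exactly half an ulp while the kept lsb is odd.
    uint32_t Dropped = Mant & 0x1FFFu;
    if (Dropped > 0x1000u || (Dropped == 0x1000u && (Half & 1u)))
      ++Half; // A mantissa carry correctly bumps the exponent.
  }
  return static_cast<uint16_t>(Sign | Half);
}
```

For example, `1.000732421875f` (that is, 1 + 0.75 * 2^-10) gives `0x3C00` with round-to-zero but `0x3C01` with round-to-nearest-even, so a fold that changes which rounding is applied can change the result by one ulp.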

https://github.com/llvm/llvm-project/pull/145203

