[llvm] [AMDGPU] Optimize image sample followed by llvm.amdgcn.cvt.pkrtz into d16 variant (PR #145203)
Russell Liu via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 3 09:29:05 PDT 2025
================
@@ -247,6 +247,42 @@ simplifyAMDGCNImageIntrinsic(const GCNSubtarget *ST,
ArgTys[0] = User->getType();
});
}
+
+ // Fold image.sample + cvt.pkrtz -> extractelement idx0 into a single
+ // d16 image sample.
----------------
GinShio wrote:
I agree we should deal with this carefully.
> This is exactly the issue I mentioned. For fixed point formats, the data is not converted to fp32 and then truncated to fp16, it's directly rounded to fp16.
I'm not sure whether the failing case is `piglit/arb_texture_view-rendering-formats`. [_fs*\_float16_](https://gitlab.freedesktop.org/mesa/piglit/-/blob/main/tests/spec/arb_texture_view/rendering-formats.c?ref_type=heads#L287-L288) is similar to this pattern:
+ the texture format is 16f,
+ sampled type is float,
+ result is used by `PackHalf2x16`.
> But then if it's OK to use the normal round-to-nearest-even mode then why would you generate pkrtz instructions in the first place?
I guess it's generated from the SPIR-V opcode `PackHalf2x16`, to fix a VKD3D-Proton issue.
References:
+ https://github.com/KhronosGroup/Vulkan-Docs/issues/1825
+ https://github.com/GPUOpen-Drivers/llpc/blob/40cb8d95ad8d6f7f1652e3fd47d39667594cce08/llpc/translator/lib/SPIRV/SPIRVReader.cpp#L10609-L10614
+ Direct3D11 Functional Specification 3.2.1 Floating Point Conversion
> Round-to-zero must be used during conversion to another float format.
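
To make the rounding difference concrete, here is a minimal standalone sketch (not code from this PR or from LLVM) that converts an fp32 value to fp16 bits with either round-toward-zero, as `cvt.pkrtz` does, or round-to-nearest-even. The names `Round` and `floatToHalfBits` are hypothetical, and the sketch assumes finite inputs whose magnitude lies in the normal fp16 range [2^-14, 65504]; it does not handle subnormals, infinities, or NaNs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only.
enum class Round { TowardZero, NearestEven };

// Convert an fp32 value to fp16 bits using the requested rounding mode.
// Assumption: |f| is finite and within the normal fp16 range [2^-14, 65504].
uint16_t floatToHalfBits(float f, Round mode) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));

  uint32_t sign = (bits >> 16) & 0x8000u;
  // Rebias the exponent from fp32 (bias 127) to fp16 (bias 15).
  int32_t exp = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
  uint32_t mant = bits & 0x7FFFFFu;
  assert(exp >= 1 && exp <= 30 && "input outside the range this sketch handles");

  // Dropping the low 13 mantissa bits is round-toward-zero.
  uint32_t half = sign | (static_cast<uint32_t>(exp) << 10) | (mant >> 13);

  if (mode == Round::NearestEven) {
    uint32_t rem = mant & 0x1FFFu; // the 13 discarded bits
    // Round up when above the halfway point, or exactly halfway and the
    // kept mantissa is odd. A carry out of the mantissa correctly bumps
    // the exponent because the fields are adjacent.
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
      ++half;
  }
  return static_cast<uint16_t>(half);
}
```

For example, `1.0f + 3.0f / 4096.0f` lies three quarters of the way between the fp16 values 0x3C00 and 0x3C01: round-toward-zero yields 0x3C00 while round-to-nearest-even yields 0x3C01. A fold that replaces one conversion path with the other can therefore differ in the last bit of the result.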
https://github.com/llvm/llvm-project/pull/145203