[PATCH] D124232: [AMDGPU] Use d16 flag for image.sample instructions
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 27 08:42:30 PDT 2022
sebastian-ne requested changes to this revision.
sebastian-ne added a comment.
This revision now requires changes to proceed.
I wanted to add this combine before, but I don’t think there is a way to add d16 to an instruction without potentially breaking the code.
The reason is, when an image_sample has the d16 flag enabled, it will use f32→f16 truncation //or// i32→i16 truncation, depending on the texture format in the descriptor.
Combining image_sample+fptrunc to image_sample d16 works fine for float textures, but I assume we don’t know at compile time if a texture is an integer or float texture.
The application may interpret the stored values as floats and apply an fptrunc, but if the texture is actually defined as an integer texture, the hardware uses an integer truncation instead, giving different results.
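To illustrate the mismatch, here is a minimal Python sketch (not AMDGPU code) that applies both conversions to the same 32-bit value. The helper names `fptrunc_bits` and `itrunc_bits` are hypothetical, and the integer case is modeled as simply keeping the low 16 bits, which is an assumption about how the hardware treats integer formats.

```python
import struct


def fptrunc_bits(raw32: int) -> int:
    """Reinterpret a 32-bit pattern as f32, convert to f16, return the 16 result bits."""
    (as_float,) = struct.unpack("<f", struct.pack("<I", raw32))
    # The 'e' format packs an IEEE 754 half-precision value.
    (half_bits,) = struct.unpack("<H", struct.pack("<e", as_float))
    return half_bits


def itrunc_bits(raw32: int) -> int:
    """Model of i32 -> i16 truncation (assumption: keep only the low 16 bits)."""
    return raw32 & 0xFFFF


raw = 0x3F800000  # bit pattern of 1.0f
print(hex(fptrunc_bits(raw)))  # 0x3c00
print(hex(itrunc_bits(raw)))   # 0x0
```

For the bit pattern of 1.0f, the float path yields half-precision 1.0 (0x3c00) while the integer path yields 0, so folding the fptrunc into the d16 flag would change the result whenever the descriptor describes an integer texture.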
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124232/new/
https://reviews.llvm.org/D124232