[PATCH] D124232: [AMDGPU] Use d16 flag for image.sample instructions

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 27 08:42:30 PDT 2022


sebastian-ne requested changes to this revision.
sebastian-ne added a comment.
This revision now requires changes to proceed.

I wanted to add this combine before, but I don’t think there is a way to add d16 to an instruction without potentially breaking the code.
The reason is that when an image_sample has the d16 flag enabled, the hardware uses either f32→f16 truncation //or// i32→i16 truncation, depending on the texture format in the descriptor.

Combining image_sample+fptrunc to image_sample d16 works fine for float textures, but I assume we don’t know at compile time if a texture is an integer or float texture.
The application may interpret the stored values as float and do an fptrunc, but if the texture is actually defined as an integer texture, the hardware uses an integer trunc instead, giving different results.
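
To illustrate the mismatch, here is a minimal Python sketch that applies both conversions to the same 32-bit result. The helper names `fptrunc_bits` and `int_trunc_bits` are made up for this example, and modelling the integer path as "keep the low 16 bits" is an assumption; the actual hardware conversion for integer formats may differ in detail (e.g. clamping).

```python
import struct

def fptrunc_bits(raw32: int) -> int:
    """Model of the shader's fptrunc: reinterpret the 32 bits as an
    IEEE f32, convert to f16, and return the 16-bit pattern."""
    value = struct.unpack("<f", struct.pack("<I", raw32))[0]
    return struct.unpack("<H", struct.pack("<e", value))[0]

def int_trunc_bits(raw32: int) -> int:
    """Assumed model of the d16 integer path: keep the low 16 bits."""
    return raw32 & 0xFFFF

# The same 32-bit results, converted both ways.
for raw in (0x3F800000, 0x00000005):
    print(f"raw={raw:#010x} "
          f"fptrunc={fptrunc_bits(raw):#06x} "
          f"int_trunc={int_trunc_bits(raw):#06x}")
```

For both inputs the two conversions disagree, so folding the fptrunc into d16 is only safe if the texture is known to be a float texture.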


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124232/new/

https://reviews.llvm.org/D124232


