[PATCH] D116943: AMDGPU/GlobalISel: Explicitly track d16 for image legalization

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 10 10:56:42 PST 2022


arsenm added a comment.

In D116943#3231807 <https://reviews.llvm.org/D116943#3231807>, @sebastian-ne wrote:

> Looks good to me, the code looks a lot better than before.
>
> Having a float image store with dmask=15 sounds bad. What happens if the load/store is assigned the last allocated register? I.e.
>
>   ; Shader allocates v[0:63]
>   image_store v63, v[1:2], s[0:7] dmask:0xf dim:SQ_RSRC_IMG_2D unorm
>
> Will the hardware try to read v64–v66 and hang because they are not allocated?
> (I remember there was an issue with d16 and the hardware incorrectly computing the register requirement, but not sure if that leads to hangs or values getting lost.)

I don't know, but my guess would be that the hardware ignores the high bits of the mask, and we should probably teach codegen to ignore them too and select (at least in the load case). I believe the out-of-bounds register access behavior has always been: discard writes, return v0.
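
For reference, the register range in question can be worked out with a small sketch. This only models what the encoding implies, not what the hardware actually does; it assumes one 32-bit VGPR per enabled dmask channel, and two channels per VGPR for packed d16. The helper names are hypothetical and not part of the patch.

```python
def data_regs_needed(dmask: int, d16_packed: bool = False) -> int:
    """Count the 32-bit data VGPRs implied by a 4-bit dmask.

    Assumption: one VGPR per enabled channel, or two channels per
    VGPR when the data is packed d16.
    """
    channels = bin(dmask & 0xF).count("1")
    if d16_packed:
        return (channels + 1) // 2
    return channels


def out_of_range_regs(first_reg: int, dmask: int, allocated: int,
                      d16_packed: bool = False) -> list:
    """List the VGPR indices the data operand would cover beyond the
    shader's allocation, where `allocated` is the VGPR count
    (v[0:63] means allocated == 64)."""
    count = data_regs_needed(dmask, d16_packed)
    return [r for r in range(first_reg, first_reg + count) if r >= allocated]


# The store from the question: data operand v63, dmask:0xf, v[0:63] allocated.
print(out_of_range_regs(63, 0xF, 64))  # [64, 65, 66]
# With only the low mask bit set, the access stays in range.
print(out_of_range_regs(63, 0x1, 64))  # []
```

Under these assumptions the dmask:0xf store implies v63 through v66, which matches the v64–v66 range asked about above.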


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116943/new/

https://reviews.llvm.org/D116943


