[PATCH] D70781: AMDGPU: Fix handling of infinite loops in fragment shaders

Tue Dec 3 00:27:11 PST 2019

critson added a comment.

In D70781#1765291 <https://reviews.llvm.org/D70781#1765291>, @cwabbott wrote:

> In D70781#1764809 <https://reviews.llvm.org/D70781#1764809>, @critson wrote:
>
> >   The extra export is for not great for performance as it introduces an unnecessary stall at the end of the shader.
>
>
> I don't expect this specific case (an otherwise-infinite loop with a discard) to happen often enough with "real" shaders for performance to matter. After all, this went completely unnoticed until it showed up in a CTS test that was created by a fuzzer.

Thanks for debugging my thoughts here. I missed that this is only triggered in the dummy return block case. I agree this is totally reasonable.

>>   The extra export overwrites the set of active lanes set by a correct export done earlier in the shader.
> 
> I think this isn't an issue because the exec mask at the end of the program is going to be the same as the mask when the last "real" export happens. The way `kill` is implemented means that control flow never reconverges for a killed thread, so it stays dead until the very end. I've done some manual tests with the aforementioned CTS test, and it does seem to be properly discarding the right pixels too. But that's a good question :)

I agree in the general case that should hold; however, if there was a discard //after// the true export done then it wouldn't.  I don't have a good use case for why such a thing would happen -- exports should in the main be the last thing a (PS) shader does.  So if we consider `kill` after `export done` semantically invalid then this is fine.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70781/new/

https://reviews.llvm.org/D70781