[PATCH] D20071: AMDGPU: Make some instructions convergent

Mon May 9 22:54:12 PDT 2016

resistor added a subscriber: resistor.
resistor added a comment.

In http://reviews.llvm.org/D20071#424886, @arsenm wrote:

> In http://reviews.llvm.org/D20071#424843, @nhaehnle wrote:
>
> > LGTM
> >
> > Out of curiosity, what does making v_readlane/v_writelane convergent fix? I thought they were independent of control flow...
>
>
> Now I'm not really sure about them. I was just thinking that any instruction that does any kind of cross-lane interaction would be convergent.

Assuming that they do what I imagine (read the corresponding register from a neighboring thread), they need to be convergent to ensure that the desired value hasn't been clobbered on the neighboring thread by the time it is read. Consider:

  r0 = readlane(r0)
  if (...) {
    // r0 unused, gets reused as scratch register
  } else {
    // use r0
  }

If we sink the readlane into the else block, things break. Imagine the scenario where thread 0 in the wavefront takes the if branch, but thread 1 takes the else branch. Because they execute in lockstep, the two sides of the branch are serialized: thread 0's code executes first and clobbers the value in its r0. By the time thread 1 gets to run the sunk readlane, the proper value is no longer available in the neighbor's r0.
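
To make the scenario above concrete, here is a toy Python model of a two-lane wavefront. It is only a sketch under the same assumption as my comment (readlane reads r0 from a neighboring lane), not real AMDGPU semantics. The names `run`, `SCRATCH`, `takes_if` and `neighbor` are made up for illustration, as are the register values.

```python
# Toy model of a 2-lane wavefront. Not real AMDGPU semantics: readlane here
# follows the comment's assumption (read r0 from a neighboring lane).
SCRATCH = -1  # hypothetical junk value left behind when r0 is reused as scratch


def run(sink_readlane):
    """Return the r0 value that lane 1 ends up using in the else block."""
    r0 = [10, 20]             # per-lane contents of r0 on entry
    takes_if = [True, False]  # lane 0 takes the if side, lane 1 the else side
    neighbor = [1, 0]         # each lane reads r0 from the other lane

    if not sink_readlane:
        # Original placement: every lane reads its neighbor before diverging.
        r0 = [r0[neighbor[lane]] for lane in range(2)]

    # Lockstep execution serializes the branch: the if side runs first ...
    for lane in range(2):
        if takes_if[lane]:
            r0[lane] = SCRATCH  # r0 is dead here, so it gets reused

    # ... and the else side runs afterwards.
    used = None
    for lane in range(2):
        if not takes_if[lane]:
            if sink_readlane:
                r0[lane] = r0[neighbor[lane]]  # sunk readlane sees clobbered r0
            used = r0[lane]
    return used


print(run(sink_readlane=False))  # 10: lane 0's original r0
print(run(sink_readlane=True))   # -1: the scratch value, not lane 0's r0
```

With the readlane left before the branch, lane 1 uses lane 0's original value. With it sunk into the else side, lane 1 reads whatever lane 0 left in r0 after using it as scratch.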

http://reviews.llvm.org/D20071