[PATCH] D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics.

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 9 10:04:24 PST 2017


tra added a comment.

In https://reviews.llvm.org/D39822#920550, @arpith-jacob wrote:

> I was not sure if the *_sync intrinsics required preventing CSE since these intrinsics capture all state as arguments (lanes in a warp to sync as an argument).  However, on Volta, I think different lanes in a warp can execute the intrinsic from different syntactic locations (i.e., different program counters).  If true, then we do indeed have to model the data exchanged.


The PTX spec says: `wait until all non-exited threads corresponding to membermask have executed vote.sync with the same qualifiers and same membermask value`, followed by a caveat: `For .target sm_6x or below, all threads in membermask must execute the same vote.sync instruction in convergence, and only threads belonging to some membermask can be active when the vote.sync instruction is executed. Otherwise, the behavior is undefined.`

My reading of this matches yours: the requirement that all threads in membermask execute the same instruction in convergence is stated only for sm_6x and below, so it does not apply to sm_70.
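
To make the concern concrete, here is a minimal sketch of the pattern being discussed. The kernel and its names are hypothetical and not taken from D39822: both halves of a warp pass the full-warp member mask, but reach `__shfl_sync` from two different call sites. Whether the two call sites may take part in a single exchange is exactly the question the PTX wording above leaves to the target, so no expected output is given.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from D39822): every lane names the full warp in
// the member mask, but the two halves of the warp reach __shfl_sync from
// different program counters.
__global__ void divergent_exchange(int *out) {
  const unsigned kFullMask = 0xffffffffu;
  const int lane = threadIdx.x % 32;
  int v;
  if (lane < 16) {
    v = __shfl_sync(kFullMask, lane, 0);        // call site A
  } else {
    v = __shfl_sync(kFullMask, lane + 100, 0);  // call site B
  }
  out[threadIdx.x] = v;
}

int main() {
  int *d_out = nullptr;
  int h_out[32];
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;
  divergent_exchange<<<1, 32>>>(d_out);
  if (cudaMemcpy(h_out, d_out, sizeof(h_out),
                 cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
  for (int i = 0; i < 32; ++i) printf("lane %2d: %d\n", i, h_out[i]);
  cudaFree(d_out);
  return 0;
}
```

This is meant to be built for sm_70 (for example with `nvcc -arch=sm_70`). Per the caveat quoted above, on sm_6x or below the behavior would be undefined. If the two call sites do pair up, the value returned at call site B comes from an operand supplied at call site A, which is not visible in B's own arguments. That is why the intrinsic cannot be treated as a pure function of its arguments.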


https://reviews.llvm.org/D39822




