[llvm-dev] [RFC] Adding thread group semantics to LangRef (motivated by GPUs)
Nicolai Hähnle via llvm-dev
llvm-dev at lists.llvm.org
Sat Dec 29 08:32:06 PST 2018
On 20.12.18 18:03, Connor Abbott wrote:
> > We already have the notion of "convergent" functions like
> > syncthreads(), to which we cannot add control-flow dependencies.
> > That is, it's legal to hoist syncthreads out of an "if", but it's
> > not legal to sink it into an "if". It's not clear to me why we
> > can't have "anticonvergent" (terrible name) functions which cannot
> > have control-flow dependencies removed from them? ballot() would be
> > both convergent and anticonvergent.
> >
> > Would that solve your problem?
>
> I think it's important to note that we already have such an attribute,
> although with the opposite sense - it's impossible to remove control
> flow dependencies from a call unless you mark it as "speculatable".
This isn't actually true. If both sides of an if/else have the same
non-speculative function call, it can still be moved out of control flow.
That's because doing so doesn't change anything at all from a
single-threaded perspective. This is why I think we should model the
communication between threads honestly.
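
As a rough illustration of why the single-threaded view is not enough,
here is a toy Python model (not LLVM code). It assumes a wave is just a
list of lanes and that ballot returns a bitmask of the active lanes whose
predicate is true. The function names and the four-lane inputs are
invented for this sketch.

```python
def ballot(active, pred):
    """Bitmask of the lanes in `active` whose predicate is true (toy model)."""
    mask = 0
    for lane in active:
        if pred[lane]:
            mask |= 1 << lane
    return mask


def ballot_in_both_branches(cond, pred):
    """ballot() is called on each side of a divergent if/else."""
    lanes = range(len(cond))
    then_lanes = [l for l in lanes if cond[l]]
    else_lanes = [l for l in lanes if not cond[l]]
    then_mask = ballot(then_lanes, pred)  # only the 'then' lanes take part
    else_mask = ballot(else_lanes, pred)  # only the 'else' lanes take part
    return [then_mask if cond[l] else else_mask for l in lanes]


def ballot_hoisted(cond, pred):
    """The same call moved out of the if/else: every lane takes part."""
    lanes = list(range(len(cond)))
    mask = ballot(lanes, pred)
    return [mask for _ in lanes]


# Hypothetical inputs: lanes 0-1 take the 'then' side, lanes 2-3 the 'else' side.
cond = [True, True, False, False]
pred = [True, False, True, True]

before = ballot_in_both_branches(cond, pred)
after = ballot_hoisted(cond, pred)
print(before)  # [1, 1, 12, 12]
print(after)   # [13, 13, 13, 13]
```

Each lane executes exactly one ballot() call in both versions, so nothing
changes from a single lane's point of view, yet the values differ because
the set of participating lanes differs.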
> However, this doesn't prevent
>
> if (...) {
> } else {
> }
> foo = ballot();
>
> from being turned into
>
> if (...) {
>     foo1 = ballot();
> } else {
>     foo2 = ballot();
> }
> foo = phi(foo1, foo2)
>
> and vice versa. We have a "noduplicate" attribute which prevents
> transforming the first into the second, but not the other way around. Of
> course we could keep going this way and add a "nocombine" attribute to
> complement noduplicate. But even then, there are still problematic
> transforms. For example, take this program, which is simplified from a
> real game that doesn't work with the AMDGPU backend:
>
> while (cond1 /* uniform */) {
>     ballot();
>     ...
>     if (cond2 /* non-uniform */) continue;
>     ...
> }
>
> In SPIR-V, when using structured control flow, the semantics of this are
> pretty clearly defined. In particular, there's a continue block after
> the body of the loop where control flow re-converges, and the only back
> edge is from the continue block, so the ballot is in uniform control
> flow. But LLVM will get rid of the continue block since it's empty, and
> re-analyze the loop as two nested loops, splitting the loop header in
> two, producing a CFG which corresponds to this:
>
> while (cond1 /* uniform */) {
>     do {
>         ballot();
>         ...
>     } while (cond2 /* non-uniform */);
>     ...
> }
>
> Now, in an implementation where control flow re-converges at the
> immediate post-dominator, this won't do the right thing anymore. In
> order to handle it correctly, you'd effectively need to always flatten
> nested loops, which will probably be really bad for performance if the
> programmer actually wanted the second thing. It also makes it impossible
> when translating a high-level language to LLVM to get the "natural"
> behavior which game developers actually expect. This is exactly the sort
> of "spooky action at a distance" which makes me think that everything
> we've done so far is really insufficient, and we need to add an explicit
> notion of control-flow divergence and reconvergence to the IR. We need a
> way to say that control flow re-converges at the continue block, so that
> LLVM won't eliminate it, and we can vectorize it correctly without
> penalizing cases where it's better for control flow not to re-converge.
Well said!
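
For anyone who wants to see the loop example concretely, below is a toy
Python simulation (a sketch under simplifying assumptions, not a model of
any real backend). It assumes every active lane votes true in ballot,
that a lane leaving the inner loop simply idles at the inner loop's exit
until the remaining lanes arrive there, and that each lane runs the same
number of iterations in total. The names run_structured, run_nested,
cont and trips are made up for the illustration.

```python
def ballot(active):
    """Bitmask of the currently active lanes (every active lane votes true)."""
    return sum(1 << lane for lane in active)


def run_structured(cont, trips):
    """Lanes reconverge at the continue block, so each ballot sees all lanes.

    `cont` is unused here: taking 'continue' only skips the loop tail.
    """
    lanes = range(len(cont))
    seen = [[] for _ in lanes]
    for _ in range(trips):
        mask = ballot(lanes)
        for lane in lanes:
            seen[lane].append(mask)
    return seen


def run_nested(cont, trips):
    """Nested-loop CFG with reconvergence at the inner loop's exit."""
    n = len(cont)
    done = [0] * n  # iterations finished per lane
    seen = [[] for _ in range(n)]
    while any(d < trips for d in done):
        # Lanes that still have work enter the inner loop together.
        active = [lane for lane in range(n) if done[lane] < trips]
        while active:
            mask = ballot(active)
            stay = []
            for lane in active:
                seen[lane].append(mask)
                took_continue = cont[lane][done[lane]]
                done[lane] += 1
                # A lane that 'continues' loops back while the others wait.
                if took_continue and done[lane] < trips:
                    stay.append(lane)
            active = stay
    return seen


# Hypothetical input: two lanes, two iterations; lane 0 takes 'continue' once.
cont = [[True, False], [False, False]]
structured = run_structured(cont, trips=2)
nested = run_nested(cont, trips=2)
print(structured)  # [[3, 3], [3, 3]]
print(nested)      # [[3, 1], [3, 2]]
```

In the structured version every ballot sees both lanes. In the nested
version the second ballot of each lane runs with only that lane active,
which is the kind of change in behavior described above.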
Cheers,
Nicolai
--
Learn how the world really is,
but never forget how it ought to be.