[LLVMdev] LLVM Parallel IR

Tue Mar 10 01:36:10 PDT 2015

> On March 9, 2015 at 6:52 PM Renato Golin <renato.golin at linaro.org> wrote:
> 
> 
> On 9 March 2015 at 17:30, Tobias Grosser <tgrosser at inf.ethz.ch> wrote:
> > If my memories are right, one of the critical issues (besides
> > other engineering considerations) was that parallelism metadata in LLVM is
> > optional and can always be dropped. However, for
> > OpenMP it sometimes is incorrect to execute a loop sequential that has been
> > marked parallel in the source code.
> 
> Exactly. The fact that metadata goes stale quickly is not a flaw, but
> a design decision. If the producing pass is close enough from the
> consuming one, you should be ok. If not, then proving legality might
> be tricky and time consuming. The problem with OpenMP and other
> vectorizer pragmas is that the metadata is introduced by the
> front-end, and it's a long way down the pipeline for it to be
> consumed. Having said that, it works ok if you keep your code simple.

I know that this was a long discussion and that the "breakability" of parallel
loop information is the result of a design decision. I also believe that this
is a good approach as long as parallelism is not part of the contract with the
user (i.e., the programmer placing explicit parallelism annotations, or the
language designer introducing parallelism into the semantics of the language).
Tobias already mentioned problems with breaking OpenMP semantics. Similarly,
other forms of parallelism, like task parallelism, could certainly be
represented using metadata, say by extracting the task code to a function and
annotating the call as being spawned for parallel execution. Again,
optimizations could drop the annotation, violating a possible contract with
the user.
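
To make the contract point concrete, here is a minimal sketch in Python (not
LLVM IR and not OpenMP). The names run_tasks, consumer and producer are
hypothetical and only serve the illustration. The assumption is that the two
tasks communicate, so running them one after the other, as a pass that drops
a "spawn" annotation would effectively do, changes the observable result.

```python
import threading


def run_tasks(parallel, timeout):
    """Run a consumer and a producer task, either spawned or sequentially.

    parallel=False models the annotation having been dropped: the same two
    tasks are simply called one after the other in program order.
    """
    ready = threading.Event()
    result = {}

    def consumer():
        # Waits for the producer's signal and gives up after `timeout` seconds.
        result["got_signal"] = ready.wait(timeout)

    def producer():
        ready.set()

    if parallel:
        threads = [
            threading.Thread(target=consumer),
            threading.Thread(target=producer),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    else:
        # Sequential execution: the consumer runs to completion before the
        # producer ever starts, so the signal cannot arrive in time.
        consumer()
        producer()
    return result["got_signal"]


if __name__ == "__main__":
    print("spawned:   ", run_tasks(parallel=True, timeout=5.0))
    print("sequential:", run_tasks(parallel=False, timeout=0.05))
```

With the tasks spawned, the consumer sees the signal. Run sequentially, it
times out, and without the timeout it would block forever.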

Alternatively we could introduce intrinsics, which is what I currently do. This
forbids certain optimizations, like moving potential memory accesses into and
out of "parallel code sections", and therefore does not break parallelism as
often. The headache this approach causes me is that basic analyses like
dominance, reachability and the like are wrong in that setting: the CFG is
still sequential, so a parallel task followed by another parallel task appears
to dominate and reach it, although nothing computed in the first task actually
dominates or even reaches the second. This of course influences the precision
and correctness of optimizations such as redundant code elimination or GVN.
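
The dominance problem can be sketched with a toy CFG in Python. This is only
an illustration under my own assumptions: the node names and the fork/join
graph are hypothetical and do not correspond to any existing LLVM
representation. The first graph is what a sequential CFG with intrinsic
markers looks like to the analyses; the second is what the parallel semantics
would suggest.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {node: [successors]}."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reachable(cfg, src):
    """All nodes reachable from src by following at least one edge."""
    seen, stack = set(), [src]
    while stack:
        n = stack.pop()
        for s in cfg[n]:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return seen


# Sequential CFG: the two task regions are simply laid out one after the other.
sequential = {
    "entry": ["task_a"],
    "task_a": ["task_b"],
    "task_b": ["exit"],
    "exit": [],
}

# Hypothetical fork/join view of the same program.
fork_join = {
    "entry": ["task_a", "task_b"],
    "task_a": ["join"],
    "task_b": ["join"],
    "join": [],
}

if __name__ == "__main__":
    print("task_a" in dominators(sequential, "entry")["task_b"])
    print("task_a" in dominators(fork_join, "entry")["task_b"])
    print("task_b" in reachable(fork_join, "task_a"))
```

On the sequential graph the analysis reports that task_a dominates task_b, so
a pass like GVN may reuse a value from task_a inside task_b. On the fork/join
graph task_a neither dominates nor reaches task_b.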

> I'd be interested in knowing what in the IR cannot be accomplished in
> terms of metadata for parallelization, and what would be the new
> constructs that needed to be added to the IR in order to do that. If
> there is benefit for your project at the same time as for OpenMP and
> our internal vectorizer annotation, changing the IR wouldn't be
> impossible. We have done the same for exception handling...

I understand that parallelism is a very invasive concept and that introducing
it into a so far "sequential" IR will cause severe breakage and headaches. But
if we accept parallelism as a first-class citizen, then I would prefer making
it a core part of the IR. One possibility to do this gradually might be to
have a separate, parallel IR, say PIR, that is lowered to regular IR at some
point (however this point is chosen). Existing optimizations can then be
gradually moved from the regular IR phase to the PIR phase where appropriate
and useful. Nevertheless, I do not propose to do such a thing in LLVM right
now. I think this might be an option for a (bigger) research project at first.

I'd be happy to hear further thoughts about that.

Cheers,

---

Kevin Streit
Neugäßchen 2
66111 Saarbrücken

Tel. +49 (0)151 23003245
streit at mailbox.org · http://www.kevinstreit.de