[LLVMdev] [RFC] OpenMP Representation in LLVM IR
andreybokhanko at gmail.com
Sat Sep 29 10:16:21 PDT 2012
Thank you for the reply!
> As you may know, this is the third such proposal over the past two
> months, one by me
> and the other, based somewhat on mine, by Sanjoy
Yes, I was aware of your proposal. I hesitated to offer any comments or
criticism -- as I am, obviously, biased.
In my opinion, the two most important differences between our proposals are:
1) Your design employs explicit procedurization done in the front-end,
while our design allows both early (right after the front-end) and late
(in the back-end) procedurization.
2) You aim to provide general support for all (or at least most)
parallel standards, while our aim is more modest -- just OpenMP.
Please see the discussion of 1) in the "Function Outlining" section of our proposal.
As for 2), there are many arguments one might use in favor of either a
more general or a more specialized solution. What is easier to implement?
What is better for LLVM IR development? Are we sure that what we see as
necessary and sufficient today would be suitable for future parallel
standards -- given all the developments happening in this area as we
speak? Whatever one answers, it would be quite subjective. My personal
preference is for the simplest and most focused solution -- but then
again, this is subjective.
> In order for your proposal to work well, there will be a lot of
> infrastructure work required (more than with my proposal); many passes
> will need to be made explicitly aware of how they can, or can't, reorder
> things with respect to the parallelization intrinsics; loop
> restructuring may require special care, etc. How this is done depends
> in part on where the state information is stored: Do we keep the
> parallelization information in the intrinsics during mid-level
> optimization, or do we move its state into an analysis pass? In any
> case, I don't object to this approach so long as we have a good plan
> for how this work will be done.
No -- only passes that happen before procedurization should be aware
of these intrinsics.
I agree that it is not so easy to make optimizations "thread-aware".
But the problem is essentially the same, no matter how the parallel
extension is manifested in the IR.
> When we discussed this earlier this year, there seemed to be some
> consensus that we wanted to avoid, to the extent possible, introducing
> OpenMP-specific intrinsics into the LLVM layer. Rather, we should
> define some kind of parallelization API (in the form of metadata,
> intrinsics, etc.) onto which OpenMP can naturally map along with other
> paradigms. There is interest in supporting OpenACC, for example, which
> will require data copying clauses, and it would make sense to share
> as much of the infrastructure as possible with OpenMP. Are you
> interested in providing Cilk support as well? We probably don't want to
> have NxM slightly-different ways of expressing 'this is a parallel
> region'. There are obviously cases in which things need to be specific
> to the interface (like runtime loop scheduling in OpenMP which implies
> a specific interaction with the runtime library), but such cases may be
> the exception rather than the rule.
> We don't need 'omp' in the intrinsic names and also 'OMP_' on all of
> the string specifiers. Maybe, to my previous point, we could call the
> intrinsics 'parallel' and use 'OMP_' only when something is really
As I said before, our aim was quite simple -- OpenMP support only.
Can the design be extended to allow a more general form of parallel
extension support? Probably... but this is definitely more than what
we aimed for.
> You don't seem to want to map thread-private variables onto the
> existing TLS support. Why?
Because we don't employ explicit procedurization. What happens after
procedurization (including how thread-private variables are manifested
in the IR) depends heavily on the OpenMP runtime library one relies
upon, and is out of the scope of our proposal.
Intel Compiler Team