[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Tue Oct 2 21:41:12 PDT 2012

On Tue, 2 Oct 2012 14:28:25 +0000
"Adve, Vikram Sadanand" <vadve at illinois.edu> wrote:

> Hal, Andrey, Alexey,
> 
> From the LLVM design viewpoint, there is a fundamental problem with
> both Hal's approach and the Intel approach: both are quite
> language-specific.  OpenMP is a particular parallel language, with
> particular constructs (e.g., parallel regions) and semantics.  LLVM
> is a language-neutral IR and infrastructure and OpenMP-specific
> concepts should not creep into it.  I've included an excerpt from
> Hal's proposal below, which shows what I mean: the design is couched
> in terms of OpenMP parallel regions.  Other parallel languages, e.g.,
> Cilk, have no such notion.  The latest Intel proposal is at least as
> OpenMP-specific.
> 
> I do agree with the general goal of trying to support parallel
> programming languages in a more first-class manner in LLVM than we do
> today.  But the right approach for that is to be as language-neutral
> as possible.  For example, any parallelism primitives in the LLVM IR
> and any LLVM- or machine-level optimizations that apply to parallel
> code should be applicable not only to OpenMP but also to languages
> like Cilk, Java/C#, and several others.  I think "libraries" like
> Intel's TBB should be supported, too: they have a (reasonably)
> well-defined semantics, just like languages do, and are becoming widely
> used.  
> 
> I also do not think LLVM metadata is the way to represent the
> primitives, because metadata is simply too fragile.  But you don't have
> to argue that one with me :-), others have argued this already.  You
> really need more first class, language-neutral, LLVM mechanisms for
> parallelism.  I'm not pretending I know how to do this, though there
> are papers on the subject, including one from an Intel team (Pillar:
> A Parallel Implementation Language, LCPC 2007).

I took a quick look at this paper; essentially, their language
introduces continuations and several different types of 'parallel call'
functions. Do we want continuations? If we want to go down this road,
then I think that something like Sanjoy's parallel_map is a good idea. My
worry with these restricted approaches is that correctly implementing
OpenMP's semantics may prove inefficient (if not impossible). The
specification dictates specific interactions between the runtime
library, certain environment variables, and the pragmas. I think that
showing that this will work will require a specific proposal and an
explicit construction of the mapping. Moreover, I fear that restricting
parallelism to some special types of call instructions will inhibit
useful loop optimizations on parallelized loops. Since parallel loop
performance is one of the most important measures for the quality of a
parallelizing compiler, we need to consider the impact on loop
optimizations carefully.
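
To make the loop-optimization concern concrete, here is a minimal sketch
in C. It is only an illustration under stated assumptions: parallel_map,
map_body_fn, scale_ctx and the scale_* functions are hypothetical names
invented for this example (they are not taken from any proposal in this
thread), and parallel_map runs serially as a stand-in for a real runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Form 1: the loop is visible in the caller, so loop passes can see the
 * induction variable, the trip count, and the memory accesses directly. */
void scale_inline(float *out, const float *in, float k, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = k * in[i];
}

/* Hypothetical "parallel call" primitive (an assumption for this sketch).
 * A real runtime would distribute iterations across threads; this
 * stand-in simply runs them in order. */
typedef void (*map_body_fn)(size_t i, void *ctx);

void parallel_map(size_t n, map_body_fn body, void *ctx) {
    assert(body != NULL);
    for (size_t i = 0; i < n; ++i)
        body(i, ctx);
}

/* Captured variables travel through an opaque context pointer. */
struct scale_ctx {
    float *out;
    const float *in;
    float k;
};

/* Form 2: the loop body is outlined into a callback. */
static void scale_body(size_t i, void *ctx) {
    struct scale_ctx *c = ctx;
    c->out[i] = c->k * c->in[i];
}

void scale_outlined(float *out, const float *in, float k, size_t n) {
    struct scale_ctx c = { out, in, k };
    parallel_map(n, scale_body, &c);
}
```

Both functions compute the same result, but in the second form the
caller contains no loop at all: the body sits behind an indirect call
and a void pointer, so transformations such as vectorization or
unrolling would first have to see through the call.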

 -Hal

> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> > To mark this function as a parallel region, a module-level
> > 'parallel' metadata entry is created. The call site(s) of this
> > function are marked with this metadata. The metadata has entries:
> >  - The string "region"
> >  - A reference to the parallel-region function
> >  - If applicable, a list of metadata references specifying
> > special-handling child regions (parallel loops and
> > serialized/critical regions)
> > 
> > If the special-handling region metadata is no longer referenced by
> > code within the parallel region, then the region has become
> > invalid, and will be removed (meaning all parallelization metadata
> > will be removed) by the ParallelizationCleanup. The same is true
> > for all other cross-referenced metadata below.
> > 
> > Note that parallel regions can be nested.
> > 
> > As a quick example, something like:
> > int main() {
> >   int a;
> > #pragma omp parallel firstprivate(a) 
> >   do_something(a);
> >   ...
> > }
> > 
> > becomes something like:
> > 
> > define private void @parreg(i32 %a) {
> > entry:
> >   call void @do_something(i32 %a)
> >   ret void
> > }
> > 
> > define i32 @main() {
> > entry:
> > ...
> > call void @parreg(i32 %a) !parallel !0
> > ...
> > 
> > !0 = metadata !{ metadata !"region", @parreg }
> > 

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory