[LLVMdev] multi-threading in llvm

Tobias Grosser tobias at grosser.es
Mon Sep 12 08:39:49 PDT 2011


On 09/12/2011 03:18 PM, Dmitry N. Mikushin wrote:
> Hi Alexandra,
>
> I don't know much, maybe this topic should be bridged with polly-dev
> (adding it to CC) to bring it more attention.

Thanks Dmitry for moving this over. I would have replied on the LLVM 
list as well, but I was away this weekend. Now I am back.

> Indeed, polly uses ScopPass, that creates serious limitations in
> compatibility with other passes. To my understanding, scops are used
> because ISL loop analysis tool uses scops.
Scops are used because this is the problem domain we are working on. 
Scops are not just loops; they can also be nested conditions without any 
loops. Describing them as loop passes is also not what we want, as the 
idea of our polyhedral description is precisely to abstract away the 
specific structure of the loops.
I would be very interested to hear which use cases you believe are 
limited by the use of ScopPass. If you are, for example, only interested 
in dependence or alias information, it might be possible to create a 
LoopPass that proxies the relevant information, such that it is 
available to other passes.

> In fact, just for handling
> OpenMP directives scops are not required, unless one need to make sure
> OpenMP directive is set for loop with parallel iterations.

Right. Scops are not needed to handle OpenMP directives, and Polly is 
actually not meant to handle OpenMP directives. It is just one of 
several ways to introduce OpenMP parallelism. Polly does this if the 
loops it generates are parallel, clang would do it if the user added 
OpenMP directives, and Alexandra may introduce parallelism in similar 
cases.

For me handling 'OpenMP directives' could mean two things:
1) clang understands OpenMP pragmas and lowers them to a set of OpenMP 
intrinsics/function calls and structures.

2) An LLVM optimization pass understands a set of high level OpenMP 
intrinsics that it can optimize and transform to specific libgomp or 
mpc.sf.net library calls.

Both are not yet available in LLVM, but would be highly useful. 
Especially 2) would be nice for Polly and probably also for Alexandra.

> Btw, it would be very interesting to know more about your
> project/purpose for this!
Oh yes. I am also highly interested.

> 2011/9/8 Jimborean Alexandra<xinfinity_a at yahoo.com>:
>> Hi,
>>
>> I want to execute the iterations of a loop in parallel, by inserting calls
>> either to pthreads or to the gomp library at the LLVM IR level. As a first
>> step, I inserted an omp pragma in a C file and compiled it with llvm-gcc to
>> check the generated LLVM code.
This is a good first step. When you do this, make sure you use
'schedule(runtime)'. Otherwise gcc will set up libgomp not only with 
function calls, but also with inlined instructions, which makes the 
generated code a lot harder to understand.

>> If I understand correctly, to parallelize the
>> loop in LLVM IR, I have to separate the loop in a new function, put all
>> required parameters in a structure, make the call to the gomp library, and
>> restore all values from the structure in the original variables.
>> Also, I have to compute the number of iterations allocated to each thread
>> and insert in the loop body a new condition, such that each thread executes
>> only its slice.
>> Is that correct?

Partially. It also depends on what kind of OpenMP parallel loop you 
want to generate. I suggest you generate a schedule(runtime) OpenMP 
parallel loop, as this is the easiest one to generate. Here you 
basically need to do this:

Host function:
(The function that contains the loop you want to parallelize)

Here you replace the loop with calls to:

GOMP_parallel_loop_runtime_start(subfunction, subfunction_data,
                                  number_of_threads, lower_bound,
                                  upper_bound, stride)
subfunction()
GOMP_parallel_end()

subfunction is the address of a new function, called subfunction. 
subfunction_data is the address of the structure that contains the data 
needed in the subfunction. The remaining arguments should be obvious.

subfunction is now basically:

long lower_bound, upper_bound;

while (GOMP_loop_runtime_next(&lower_bound, &upper_bound)) {
        for (long i = lower_bound; i < upper_bound; i += stride) {
                // Put your loop body here
        }
}

>> As far as I know, both llvm-gcc and Polly already offer support for OpenMP,
>> by inserting calls to the gomp library.

Polly supports automatic parallelization of the loops it generates. gcc 
(and therefore llvm-gcc) supports both user-added OpenMP pragmas and 
automatically added OpenMP calls (provided by the -autopar pass).

> Can this code be reused?
It depends on what you plan to do. The code in Polly is currently 
specific to Polly: it only creates the calls to OpenMP that we need and 
basically builds an OpenMP loop from scratch.

What you most probably want is a pass that takes an existing LLVM-IR 
loop and translates it into an OpenMP parallel loop.

From Polly you could get the functions that create the definitions of 
the relevant libgomp functions, that set up the subfunctions, and that 
create the new loop structure. To get a pass that translates an LLVM-IR 
loop to an LLVM-IR loop, you would still need to implement the actual 
transformation.

>> Is there a pass that I can call to do all these code transformations?
No. LLVM includes no passes for OpenMP transformations. Polly just does 
code generation for its own specific use case. gcc has some passes that 
lower high-level OpenMP calls to more individual OpenMP calls and 
calculations.

>> I had a look at the CodeGeneration from Polly. Is it possible to use it
>> without creating the Scops, by transforming it into a LoopPass?
No. It is not a pass that translates LLVM-IR to LLVM-IR; rather, it 
creates new LLVM-IR from a polyhedral description. So without Scops, 
and therefore without a polyhedral description, it cannot be used.

>> Could you indicate how is this handled in llvm-gcc?
The only pass that generates OpenMP calls in llvm-gcc is the gcc 
-autopar pass. It basically detects whether a loop is parallel and 
introduces calls to libgomp (I am not sure if it creates higher-level 
intrinsics that are lowered by a subsequent gcc pass to the actual 
libgomp calls, or emits the equivalent instructions directly).

Alexandra, let me know what you are planning to do. Maybe we can work 
together on some OpenMP infrastructure in LLVM. I would love to have 
both a generic OpenMP builder that can be used by your transformations 
as well as by Polly, and a pass that lowers high-level OpenMP 
intrinsics to high-performance OpenMP code and low-level function 
calls.

Cheers
Tobi
