[LLVMdev] Available code-generation parallelism

Thu Nov 6 18:55:27 PST 2008

On Mon, 2008-11-03 at 01:06 -0800, Chris Lattner wrote:
> On Nov 2, 2008, at 2:20 PM, Jonathan Brandmeyer wrote:
> > I am interested in making my LLVM front-end multi-threaded in a way
> > similar to the GCC compiler server proposal and was wondering about
> > the extent that the LLVM passes support it.
> 
> Do you have a link for this?  I'm not familiar with any parallelism  
> proposed by that project.  My understanding was that it was mostly  
> about sharing across invocations of the compiler.

Nope, you're right.  I'm not sure where I got that idea, but I certainly
don't see it in their whitepaper.

> Are you talking about building your AST or about building LLVM IR?
> The rules for constructing your AST are pretty much defined by you.   
> The rules for constructing LLVM IR are a bit more tricky.  The most  
> significant issue right now is that certain objects in LLVM IR are  
> uniqued (like constants) and these have use/def chains.  Since use/def  
> chain updating is not atomic or locked, this means that you can't  
> create LLVM IR on multiple threads.  This is something that I'm very  
> much interested in solving someday, but no one is working on it at  
> this time (that I'm aware of).
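
To make the hazard described above concrete, here is a minimal sketch of
why uniqued values with use lists are shared state.  Everything in it
(Constant, Instr, ConstantPool, Function, addUse) is a hypothetical
stand-in written for illustration, not the real LLVM classes or API.  It
only shows that two otherwise unrelated functions end up writing to the
same use list as soon as they mention the same constant.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT the real LLVM classes.
struct Instr;

struct Constant {
    int value;
    std::vector<Instr*> uses;  // use list: every instruction referring to this constant
};

struct Instr {
    Constant* operand;
};

// One table for the whole process: equal values map to the same Constant object.
class ConstantPool {
public:
    Constant* get(int value) {
        auto it = table_.find(value);
        if (it == table_.end())
            it = table_.emplace(value, std::make_unique<Constant>(Constant{value, {}})).first;
        return it->second.get();
    }

private:
    std::map<int, std::unique_ptr<Constant>> table_;
};

struct Function {
    std::vector<std::unique_ptr<Instr>> body;

    // Creating an instruction appends to the constant's use list, which is
    // shared by every function that mentions the same constant.
    Instr* addUse(Constant* c) {
        assert(c != nullptr);
        body.push_back(std::make_unique<Instr>(Instr{c}));
        Instr* i = body.back().get();
        c->uses.push_back(i);  // unsynchronized write to shared state
        return i;
    }
};
```

If two threads each built their own Function and both called addUse on
the constant 42, the two push_back calls on the same vector would race
unless the use list were locked or updated atomically.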

I'm referring to implementing the construction, optimization, and object
code generation in parallel.

> > Function-at-a-time parallel construction:
> > Which (if any) LLVM objects support the object-level thread safety
> > guarantee?  If I construct two separate function pass managers in
> > separate threads and use them to optimize and emit object code for
> > separate llvm::Function definitions in the program, will this work?
> > Same question for llvm::Modules.
> 
> Unfortunately, for the above reason... basically none.  The LLVM code  
> generators are actually very close to being able to run in parallel.   
> The major issue is that they run a few LLVM IR-level passes first (LSR  
> and codegen prepare) that hack on LLVM IR before the code generators  
> run.  Because of this, they inherit the limitations of LLVM IR  
> passes.  Very long term, I'd really like to make the code generator  
> not affect the LLVM IR being put into them, but this is not likely to  
> happen anytime in the near future.

> If you're interested in this, tackling the use/def atomicity issues  
> would be a great place to start.

What about lazy unification of uniqued values after IR construction?  If
that pass runs on a per-module basis, so that values are only ever
uniqued within a single Module, then all of the Modules will be isolated
in memory from each other.  The front-end can partition
its source into N modules in whatever way it sees fit.  Then it can
instantiate a PassManager and Module per thread and build the IR into
them.  That isn't quite as nice as taking advantage of per-function
parallelism where the individual passes allow it, but it would be a step
in the right direction.
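
A rough sketch of what I have in mind, under heavy assumptions: the
Module type, addOperand, unifyConstants, and buildInParallel below are
hypothetical stand-ins I made up for illustration, not LLVM's API.
Constants are plain ints, construction only appends (it never looks up an
existing value), and unification runs afterwards as a per-module pass on
the same thread that built the module.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module; NOT the real llvm::Module.
struct Module {
    std::vector<int> constants;         // duplicates are allowed during construction
    std::vector<std::size_t> operands;  // each entry is a slot index into `constants`

    // Construction never searches for an existing constant; it only appends.
    void addOperand(int value) {
        constants.push_back(value);
        operands.push_back(constants.size() - 1);
    }

    // Lazy unification: merge duplicate constants after construction and
    // rewrite every operand to point at the surviving slot.
    void unifyConstants() {
        std::map<int, std::size_t> seen;  // value -> slot in `unique`
        std::vector<int> unique;
        std::vector<std::size_t> remap(constants.size());
        for (std::size_t i = 0; i < constants.size(); ++i) {
            auto it = seen.find(constants[i]);
            if (it == seen.end()) {
                unique.push_back(constants[i]);
                it = seen.emplace(constants[i], unique.size() - 1).first;
            }
            remap[i] = it->second;
        }
        for (std::size_t& op : operands) op = remap[op];
        assert(unique.size() <= remap.size());
        constants = std::move(unique);
    }
};

// One Module per partition, each built and unified on its own thread.
// Thread i touches only modules[i], so nothing is shared and nothing is locked.
std::vector<Module> buildInParallel(const std::vector<std::vector<int>>& partitions) {
    std::vector<Module> modules(partitions.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < partitions.size(); ++i) {
        workers.emplace_back([&modules, &partitions, i] {
            for (int v : partitions[i]) modules[i].addOperand(v);
            modules[i].unifyConstants();
        });
    }
    for (auto& w : workers) w.join();
    return modules;
}
```

The cost is that a value appearing in two partitions exists once per
Module rather than once per process, which is part of why I am asking the
question below.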

Why are Constants uniqued?  Is it purely for the memory savings?

-Jonathan