[LLVMdev] Available code-generation parallelism

Thu Nov 6 13:15:41 PST 2008

On Nov 6, 4:22 am, Jonathan Brandmeyer <jbrandme... at earthlink.net>
wrote:
> On Mon, 2008-11-03 at 23:59 -0800, Chris Lattner wrote:
> > On Nov 3, 2008, at 3:55 PM, heisenbug wrote:
> > > What about "inventing" pseudo-constants (which point to the right
> > > thing) and build the piece of IR with them. When done, grab mutex and
> > > RAUW it in. Alternatively, submit to a privileged thread that performs
> > > the RAUW.
> > > The trick is to prepare the def/use chain(s) to a degree that the
> > > mutex is only held a minimal time. If only IR-builder threads are
> > > running concurrently there is no danger that a real constant vanishes,
> > > leaving behind a stale reference from a pseudo-constant.
>
> > That could work.  It would also have to be done for global values as  
> > well, and inline asm objects etc.  However, I don't see any show-
> > stoppers.  The implementation could be tricky, but a nice property of  
> > your approach is that the single threaded case could be made  
> > particularly fast (instead of doing atomic ops or locking always).
>
> It might work for the IR construction phase, but not for optimization
> and emitting object code.  The locking issue is going to be severe

When optimization is done, you can certainly go on and emit code
in parallel for many functions. Since the IR is not mutated any
more, nothing needs to be locked. I believe the SDGraphs can be
isolated from each other. But Dan may know better...

Cheers,

    Gabor

> because it will be nearly (completely?) impossible to guarantee a
> globally consistent lock order for any given def/use chain.  Therefore,
> such a solution would require a kind of high-level contention manager
> akin to software transactional memory (STM).  Even the fastest STMs in
> research are much slower than locking.  I think that there is a better
> way.
>
> I would like to propose a different solution: Lift all internalized
> objects to be unique at the Module level instead of globally.  This will
> require an initial pass to be performed (called IRFinalize?), and
> equality of Type objects by pointer comparison will not be valid until
> after this pass is complete.  The Module is already the unit of
> compilation once LLVM IR has been initially emitted for most cases, and
> it should be straightfoward to structure the pass such that it can work
> on single functions if the user is compiling at that level.
>
> The IRFinalize pass can allocate the bookkeeping storage for identifying
> duplicate Constants and Types and then release it so long as none of the
> optimization and analysis passes 1) Emit new Types or 2) are broken by
> duplicate Constants.
>
> -Jonathan
>
> PS: What is RAUW?  I'll volunteer the clerical work of adding it to the
> Lexicon if you'd be kind enough to hand me a small dose of clue :)
>
> > -Chris
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVM... at cs.uiuc.edu        http://llvm.cs.uiuc.edu
> >http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVM... at cs.uiuc.edu        http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev