[llvm-dev] Multi-Threading Compilers

Tue Mar 24 23:52:02 PDT 2020

On 3/19/20 7:37 PM, Nicholas Krause wrote:
 >
 >
 > On 3/19/20 5:31 PM, Johannes Doerfert wrote:
 >>
 >> On 3/18/20 9:05 PM, Nicholas Krause wrote:
 >> >
 >> >
 >> > On 3/18/20 9:49 AM, Nicolai Hähnle wrote:
 >> >> On Wed, Mar 18, 2020 at 7:23 AM Nicholas Krause via llvm-dev
 >> >> <llvm-dev at lists.llvm.org> wrote:
 >> >>> On 3/3/20 8:37 PM, Chris Lattner wrote:
 >> >>>
 >> >>> On Feb 28, 2020, at 6:03 PM, Chris Lattner <clattner at nondot.org> wrote:
 >> >>>
 >> >>>
 >> >>> On Feb 28, 2020, at 8:56 AM, Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
 >> >>>
 >> >>> On 02/28, Nicholas Krause via llvm-dev wrote:
 >> >>>
 >> >>> Anyhow what is the status and what parts are we planning to move to
 >> >>> MLIR in LLVM/Clang.  I've not seen any discussion on that other than
 >> >>> starting to plan for it.
 >> >>>
 >> >>>
 >> >>> As far as I know, there is no (detailed/discussed/agreed upon/...) plan
 >> >>> to move any existing functionality in LLVM-Core or Clang to MLIR. Some
 >> >>> people have expressed interest, and there is Chris's plan for how
 >> >>> the transition could look.
 >> >>>
 >> >>>
 >> >>> Yep, agreed, I gave a talk a couple days ago (with Tatiana) with a
 >> >>> proposed path forward, but explained it as one possible path.  We’ll
 >> >>> share the slides publicly in a few days after a couple things get taken
 >> >>> care of.
 >> >>>
 >> >>>
 >> >>> Hi all,
 >> >>>
 >> >>> Here is a link to the CGO presentation slides (outlining a possible
 >> >>> path to incremental adoption of MLIR in Clang) for anyone curious.
 >> >>>
 >> >>> -Chris
 >> >>>
 >> >>> Greetings,
 >> >>> Following David Blaikie's suggestion, I'm merging the two threads for
 >> >>> this discussion. The original commenter is Johannes Doerfert,
 >> >>> starting with "Hey,":
 >> >>>
 >> >>> Hey,
 >> >>>
 >> >>> Apologies for the wait, everything right now is going crazy..
 >> >>>
 >> >>> Compiler folks are very busy people, as unfortunately there aren't
 >> >>> many of us, so no need to apologize. I've yet to hear from anyone on
 >> >>> the GCC side and will wait until after GCC 11 is released because of
 >> >>> this. Not to mention the health issues around COVID-19.
 >> >>>
 >> >>>
 >> >>> I think we should move this conversation to the llvm-dev list early on,
 >> >>> but generally speaking we can see three options here:
 >> >>> 1) parallelize single passes or a subset of passes that are known not
 >> >>>    to interfere, e.g., the Attributor,
 >> >>> 2) parallelize analysis pass execution before a transformation that
 >> >>>    needs them,
 >> >>> 3) investigate what needs to be done for parallel execution of many
 >> >>>    passes, e.g., how we can avoid races on shared structures such as
 >> >>>    the constant pool.
 >> >>>
 >> >>> I was researching this on and off for the last few months in terms of
 >> >>> figuring out how to make the pass manager itself async. It's not easy
 >> >>> and I'm not even sure it's possible. I'm not sure about GIMPLE, as I
 >> >>> would have to ask the middle-end maintainer on the GCC side, but LLVM
 >> >>> IR does not seem to have shared-state detection in the core classes,
 >> >>> and the same goes for the back ends. So yes, this would interest me.
 >> >>>
 >> >>> The first place to start is with which data structures are definitely
 >> >>> shared. The biggest ones in terms of shared state seem to be basic
 >> >>> blocks and function definitions, as those would be shared by passes
 >> >>> running on each function.  We should start by looking at implementing
 >> >>> locks or ref counting here, if you're OK with that. It also lets me
 >> >>> understand more concretely the linkage between the core classes, as
 >> >>> would be required for multi-threading LLVM. In addition, it lets us
 >> >>> look into how to partition the work across threads at the same time.
 >> >>>
 >> >>> As was discussed on the previous thread, the general assumption is
 >> >>> that one wouldn't try to run two function optimizations on the same
 >> >>> function at the same time, but would instead, for instance, run
 >> >>> function optimizations on unrelated functions at the same time (or
 >> >>> CGSCC passes on distinct CGSCCs). But this is difficult in LLVM IR
 >> >>> because use lists are shared: if two functions use the same global
 >> >>> variable or call the same 3rd function, optimizing out a function call
 >> >>> from each of those functions becomes a write to shared state when
 >> >>> trying to update the use list of that 3rd function. MLIR apparently
 >> >>> has a different design in this regard that is intended to be more
 >> >>> amenable to these situations.
 >> >> As mentioned on the other thread, the main challenge here is in the
 >> >> use lists of constant values (which includes also globals and
 >> >> functions). Right now, those are linked lists that are global for an
 >> >> entire LLVMContext. Every addition or removal of a use of a constant
 >> >> has to touch them, and doing such fine-grained locking doesn't seem
 >> >> like a great idea.
 >> > GCC has the same issues in terms of certain core structures, so I'm not
 >> > really surprised.
 >> >>
 >> >> So this is probably the biggest and seemingly most daunting thing
 >> >> you'd have to address first, but it's feasible and seems like a good
 >> >> idea to evolve LLVM IR in a direction where it ends up looking more
 >> >> like MLIR and can avoid these locks.
 >> > Sure that makes sense I will see what Johannes wants to start with.
 >>
 >> I think addressing this issue first makes sense. I would however start
 >> by determining the actual impact of different design choices here. I
 >> mean, do we know locks will be heavily contended? If I had to guess I'd
 >> say most passes will not create or modify functions nor add or remove
 >> calls. I further guess that passes which create/query llvm::Constant
 >> values will do so for ConstantInt between -1 and 2, I mean most of the
 >> time. This might be wrong but we should for sure check before we
 >> redesign the entire constant handling (as MLIR did). My suggestion is to
 >> profile first. What we want is to monitor the use-list of constants but
 >> I'm not sure if that is easy off the top of my head. What we can do
 >> easily is to print a message in the methods that are used to "create" a
 >> constant, thus the constructors (of llvm::Constant) and the
 >> ConstantXXX::get() methods. We print the pass names and these "constant
 >> generation" messages in a run of the test suite and analyze the result.
 >> What passes create constants, how often, which (kind of) constants, etc.
 >> We should also determine if any pass ever walks the use list of
 >> constants. I know we do it for global symbols but I don't know if we do it
 >> for others. That said, I think it is sensible to distinguish global
 >> symbols and other constants at some point because (I think) we use them
 >> differently.
 >>
 >> From there we decide how to move forward. Localized constants, as MLIR
 >> has them, some locking (or similar solution), or maybe just restrictions
 >> on the parallel execution of passes.
 >>
 >> I hope this makes some sense.
 > If you're talking about this class:
 > https://llvm.org/doxygen/classllvm_1_1Constant.html

Yes.

 > Then yes, that makes sense in terms of getting the data. The only three
 > questions I would ask are:
 > 1. There are methods in the pass classes for getting their names. However,
 > the problem would be adding a switch statement to check what we're passing
 > into the constructor. I'm not sure what the top-level class providing
 > getName etc. is; if there is one, I'm assuming it is Value or ModulePass.
 > That makes it easier, as you don't have to worry about data types and can
 > just walk down the virtual hierarchy. Names, at least to my knowledge, are
 > a top-level feature and we should just use them there. So just do:
 > XXX->GetName() where XXX is the top-level class(es).

You lost me. I was thinking of adding a print in all the interesting
llvm::Constant classes. If you run opt (or clang) with
-debug-pass=Details you should see those prints interleaved with the
messages about which passes are run. So you can attribute actions on
constants to passes. This exact approach might not work but there should
be a way to do this.
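
The bookkeeping behind this kind of attribution can be sketched without
touching LLVM at all. The block below is only an illustration under stated
assumptions: ConstantCreationProfile, enterPass, and noteConstant are
hypothetical names, not LLVM APIs, and the kind strings stand in for
whatever the real prints would emit.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not an LLVM API: counts "constant generation" events
// per (pass name, constant kind) pair.
class ConstantCreationProfile {
public:
  // Would be called whenever a new pass starts running.
  void enterPass(std::string Name) { CurrentPass = std::move(Name); }

  // Would be called from the instrumented llvm::Constant constructors and
  // ConstantXXX::get() methods (assumption: Kind names the constant class).
  void noteConstant(const std::string &Kind) {
    assert(!Kind.empty() && "constant kind must be named");
    ++Counts[std::make_pair(CurrentPass, Kind)];
  }

  // Number of events recorded for Kind while Pass was the current pass.
  std::uint64_t count(const std::string &Pass, const std::string &Kind) const {
    auto It = Counts.find(std::make_pair(Pass, Kind));
    return It == Counts.end() ? 0 : It->second;
  }

private:
  std::string CurrentPass = "<no pass>";
  std::map<std::pair<std::string, std::string>, std::uint64_t> Counts;
};
```

Events recorded before any pass has started are filed under "<no pass>",
so constants created during parsing or setup do not get blamed on the
first pass.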

 > 2. Lock contention is rather vague. For example, we could have multiple
 > readers and few writers. So we would also need to figure that out in
 > order to choose the right locks. Thoughts?

A lot, but at the end it depends on what we are dealing with. Let's
postpone this discussion.
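
For reference only, the "multiple readers and few writers" case from the
question is what a reader-writer lock targets. The block below is a minimal
sketch using C++17 std::shared_mutex. GuardedUseList is a hypothetical
stand-in, not how llvm::Value stores its uses, and the counters are only
there to show how the read/write ratio could be measured before choosing a
lock.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a shared use list guarded by a reader-writer
// lock. Users are represented by plain integer ids for illustration.
class GuardedUseList {
public:
  void addUse(int UserId) {
    std::unique_lock<std::shared_mutex> Lock(Mutex); // exclusive: one writer
    ++WriteLocks;
    Users.push_back(UserId);
  }

  // Removes one use of UserId; returns false if there was none.
  bool removeUse(int UserId) {
    std::unique_lock<std::shared_mutex> Lock(Mutex);
    ++WriteLocks;
    auto It = std::find(Users.begin(), Users.end(), UserId);
    if (It == Users.end())
      return false;
    Users.erase(It);
    return true;
  }

  std::size_t numUses() const {
    std::shared_lock<std::shared_mutex> Lock(Mutex); // shared: readers overlap
    ++ReadLocks;
    return Users.size();
  }

  // Lock acquisition counts, usable to estimate the reader/writer ratio.
  std::uint64_t reads() const { return ReadLocks.load(); }
  std::uint64_t writes() const { return WriteLocks.load(); }

private:
  mutable std::shared_mutex Mutex;
  mutable std::atomic<std::uint64_t> ReadLocks{0};
  std::atomic<std::uint64_t> WriteLocks{0};
  std::vector<int> Users;
};
```

Whether a lock like this, localized constants, or restrictions on parallel
pass execution is the better fit depends on the measured ratio.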

 > 3. Does LLVM have something like the GCC compile farm? At some point
 > we're going to need to test on larger machines for scaling and race
 > conditions.

We have buildbots but most people have access to larger machines.
Testing and measuring clang on a large node or on multiple nodes will be
the least of our problems ;)

 > The only other thing is that LLVM seems to be in late RC. It doesn't
 > matter to me, but do we want to wait for LLVM 10 final to be released, as
 > this is new work? I don't think it matters, frankly, as this is a long
 > way off from being in mainline anytime soon.

Agreed. We need to gather data. Determine options forward, present those
options to the community and start a discussion, prototype options to
see how they pan out, hopefully finish the discussion, implement the
desired solution, go through review and testing, merge and enable it, in
roughly this order.

 > Thanks and sorry if I'm mistaken, Nick
 >
 >>
 >> Cheers, Johannes
 >>
 >>
 >>
 >