[llvm-dev] Multi-Threading Compilers

Thu Mar 19 14:31:19 PDT 2020

On 3/18/20 9:05 PM, Nicholas Krause wrote:
 >
 >
 > On 3/18/20 9:49 AM, Nicolai Hähnle wrote:
 >> On Wed, Mar 18, 2020 at 7:23 AM Nicholas Krause via llvm-dev
 >> <llvm-dev at lists.llvm.org> wrote:
 >>> On 3/3/20 8:37 PM, Chris Lattner wrote:
 >>>
 >>> On Feb 28, 2020, at 6:03 PM, Chris Lattner <clattner at nondot.org> wrote:
 >>>
 >>>
 >>> On Feb 28, 2020, at 8:56 AM, Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
 >>>
 >>> On 02/28, Nicholas Krause via llvm-dev wrote:
 >>>
 >>> Anyhow what is the status and what parts are we planning to move to
 >>> MLIR in LLVM/Clang.  I've not seen any discussion on that other than
 >>> starting to plan for it.
 >>>
 >>>
 >>> As far as I know, there is no (detailed/discussed/agreed upon/...) plan
 >>> to move any existing functionality in LLVM-Core or Clang to MLIR. Some
 >>> people have expressed interest, and there is Chris's plan for what the
 >>> transition could look like.
 >>>
 >>>
 >>> Yep, agreed, I gave a talk a couple days ago (with Tatiana) with a
 >>> proposed path forward, but explained it as one possible path.  We’ll
 >>> share the slides publicly in a few days after a couple things get taken
 >>> care of.
 >>>
 >>>
 >>> Hi all,
 >>>
 >>> Here is a link to the CGO presentation slides (outlining a possible
 >>> path to incremental adoption of MLIR in Clang) for anyone curious.
 >>>
 >>> -Chris
 >>>
 >>> Greetings,
 >>> Following David Blaikie's suggestion, I'm merging the two threads for
 >>> this discussion. The original commenter is Johannes Doerfert,
 >>> starting with "Hey,":
 >>>
 >>> Hey,
 >>>
 >>> Apologies for the wait, everything right now is going crazy..
 >>>
 >>> Compiler folks are very busy people, as there aren't many of us
 >>> unfortunately, so no need to apologize. I've yet to hear from someone
 >>> on the GCC side and will wait until after GCC 11 is released because
 >>> of this. Not to mention the health issues around COVID-19.
 >>>
 >>>
 >>> I think we should move this conversation to the llvm-dev list early
 >>> on, but generally speaking we can see three options here:
 >>> 1) parallelize single passes or a subset of passes that are known
 >>>    not to interfere, e.g. the Attributor,
 >>> 2) parallelize analysis pass execution before a transformation that
 >>>    needs them,
 >>> 3) investigate what needs to be done for a parallel execution of
 >>>    many passes, e.g. how can we avoid races on shared structures such
 >>>    as the constant pool.
 >>>
 >>> I was researching this on and off for the last few months, trying to
 >>> figure out how to make the pass manager itself async. It's not easy and
 >>> I'm not even sure it's possible. I'm not sure about GIMPLE, as I would
 >>> have to ask the middle-end maintainer on the GCC side, but LLVM IR does
 >>> not seem to have shared-state detection for the core classes, and the
 >>> same goes for the back ends. So yes, this would interest me.
 >>>
 >>> The first step is to establish which data structures are definitely
 >>> shared. The biggest ones seem to be basic blocks and function
 >>> definitions, as those would be shared by passes running on each
 >>> function. We should start by looking at implementing locks or ref
 >>> counting there, if you're OK with that. It also lets me understand more
 >>> concretely the linkage between the core classes, as would be required
 >>> for multi-threading LLVM. In addition, it allows us to look into how to
 >>> partition work across threads at the same time.
 >>>
 >>> As was discussed on the previous thread - generally the assumption
 >>> is that one wouldn't try to run two function optimizations on the same
 >>> function at the same time, but, for instance - run function
 >>> optimizations on unrelated functions at the same time (or CGSCC passes
 >>> on distinct CGSCCs). But this is difficult in LLVM IR because use lists
 >>> are shared - so if two functions use the same global variable or call
 >>> the same 3rd function, optimizing out a function call from each of those
 >>> functions becomes a write to shared state when trying to update the use
 >>> list of that 3rd function. MLIR apparently has a different design in
 >>> this regard that is intended to be more amenable to these situations.
 >> As mentioned on the other thread, the main challenge here is in the
 >> use lists of constant values (which includes also globals and
 >> functions). Right now, those are linked lists that are global for an
 >> entire LLVMContext. Every addition or removal of a use of a constant
 >> has to touch them, and doing such fine-grained locking doesn't seem
 >> like a great idea.
 > GCC has the same issues in terms of certain core structures, so I'm not
 > really surprised.
 >>
 >> So this is probably the biggest and seemingly most daunting thing
 >> you'd have to address first, but it's feasible and seems like a good
 >> idea to evolve LLVM IR in a direction where it ends up looking more
 >> like MLIR and can avoid these locks.
 > Sure, that makes sense. I will see what Johannes wants to start with.
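
To make the use-list problem quoted above concrete, here is a minimal toy
model. It is only a sketch and does not use any LLVM API: ToyGlobal,
ToyFunction, dropCalls and sharedWrites are all hypothetical names. No
threads are spawned; the sketch only counts the writes that would need
synchronization if the two callers were optimized concurrently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for a global value (a function or global variable).
// This is NOT llvm::Value; every name here is hypothetical.
struct ToyGlobal {
  std::string name;
  std::list<int> uses;           // ids of the instructions referencing this value
  std::size_t sharedWrites = 0;  // mutations that would need synchronization

  void addUse(int id) {
    uses.push_back(id);
    ++sharedWrites;
  }

  void removeUse(int id) {
    auto it = std::find(uses.begin(), uses.end(), id);
    assert(it != uses.end() && "removing a use that was never registered");
    uses.erase(it);
    ++sharedWrites;
  }
};

// Toy function body: just the ids of its call sites to one shared callee.
struct ToyFunction {
  std::string name;
  std::vector<int> callSites;
};

// Simulates a function-local optimization that deletes every call to `callee`.
// The pass only inspects `f`, yet each deletion writes to `callee.uses`.
// Returns the number of writes to shared state this run caused.
std::size_t dropCalls(ToyFunction &f, ToyGlobal &callee) {
  std::size_t before = callee.sharedWrites;
  for (int id : f.callSites)
    callee.removeUse(id);
  f.callSites.clear();
  return callee.sharedWrites - before;
}
```

Running dropCalls on two unrelated callers of the same callee mutates the
same list both times, which is exactly the write to shared state that
would race without a lock or a different IR design.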

I think addressing this issue first makes sense. I would, however, start
by determining the actual impact of the different design choices here. I
mean, do we know the locks will be heavily contended? If I had to guess,
I'd say most passes will not create or modify functions, nor add or
remove calls. I would further guess that passes which create/query
llvm::Constant values will mostly do so for ConstantInt values between -1
and 2. This might be wrong, but we should for sure check before we
redesign the entire constant handling (as MLIR did).

My suggestion is to profile first. What we want is to monitor the
use-lists of constants, but I'm not sure off the top of my head whether
that is easy. What we can do easily is print a message in the methods
that are used to "create" a constant, that is, the constructors (of
llvm::Constant) and the ConstantXXX::get() methods. We print the pass
names and these "constant generation" messages in a run of the test suite
and analyze the result: which passes create constants, how often, which
(kinds of) constants, etc. We should also determine whether any pass ever
walks the use list of constants. I know we do it for global symbols, but I
don't know whether we do it for others. That said, I think it is sensible
to distinguish global symbols from other constants at some point because
(I think) we use them differently.
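
A rough sketch of the bookkeeping such a profiling run could do, assuming
we tally events instead of printing raw messages. Nothing here is an
existing LLVM facility: ConstantCreationLog and its methods are
hypothetical, and the calls would have to be wired by hand into the
constant-creation entry points and the pass manager.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical profiling helper; none of these names exist in LLVM.
class ConstantCreationLog {
public:
  // Would be called by the pass manager right before a pass runs.
  void enterPass(const std::string &passName) {
    assert(!passName.empty() && "pass name must not be empty");
    current = passName;
  }

  // Would be called from the integer-constant creation path. Small values
  // get their own bucket to test the "-1 to 2" guess.
  void recordConstantInt(std::int64_t value) {
    bool isSmall = value >= -1 && value <= 2;
    bump(isSmall ? "ConstantInt[-1,2]" : "ConstantInt[other]");
  }

  // Would be called from the creation path of any other constant kind.
  void recordOther(const std::string &kind) { bump(kind); }

  // Number of recorded creations of `kind` while `pass` was running.
  unsigned long count(const std::string &pass, const std::string &kind) const {
    auto it = counts.find(Key(pass, kind));
    return it == counts.end() ? 0 : it->second;
  }

private:
  using Key = std::pair<std::string, std::string>;  // (pass name, constant kind)

  void bump(const std::string &kind) { ++counts[Key(current, kind)]; }

  std::string current = "<none>";
  std::map<Key, unsigned long> counts;
};
```

Dumping the counts after a test-suite run would show, per pass, how many
constants were created and how many fall in the small-integer bucket. A
single global log like this would itself need synchronization once passes
actually run in parallel, so it is only meant for the sequential
profiling step.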

From there we decide how to move forward: localized constants, as MLIR
has them, some locking (or a similar solution), or maybe just restrictions
on the parallel execution of passes.

I hope this makes some sense.

Cheers,
   Johannes