[llvm-dev] Multi-Threading Compilers

Wed Mar 18 06:49:26 PDT 2020

On Wed, Mar 18, 2020 at 7:23 AM Nicholas Krause via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> On 3/3/20 8:37 PM, Chris Lattner wrote:
>
> On Feb 28, 2020, at 6:03 PM, Chris Lattner <clattner at nondot.org> wrote:
>
>
> On Feb 28, 2020, at 8:56 AM, Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
>
> On 02/28, Nicholas Krause via llvm-dev wrote:
>
> Anyhow, what is the status, and which parts of LLVM/Clang are we
> planning to move to MLIR?  I've not seen any discussion on that other
> than starting to plan for it.
>
>
> As far as I know, there is no (detailed/discussed/agreed upon/...) plan
> to move any existing functionality in LLVM-Core or Clang to MLIR. Some
> people have expressed interest, and there is Chris's plan for what
> the transition could look like.
>
>
> Yep, agreed, I gave a talk a couple days ago (with Tatiana) with a proposed path forward, but explained it as one possible path.  We’ll share the slides publicly in a few days after a couple things get taken care of.
>
>
> Hi all,
>
> Here is a link to the CGO presentation slides (outlining a possible path to incremental adoption of MLIR in Clang) for anyone curious.
>
> -Chris
>
> Greetings,
> Per David Blaikie's suggestion, I'm merging the two threads for this discussion. The original commenter is Johannes Doerfert,
> starting with "Hey,":
>
> Hey,
>
> Apologies for the wait, everything right now is going crazy..
>
> Compiler folks are very busy people, as there unfortunately aren't that many of us, so no need to
> apologize. I've yet to hear from anyone on the GCC side and will wait until after GCC 11
> is released because of this. Not to mention the health issues around COVID-19.
>
>
> I think we should move this conversation to the llvm-dev list early on, but generally speaking we can see three options here:
> 1) parallelize single passes or a subset of passes that are known not to interfere, e.g. the attributor,
> 2) parallelize analysis pass execution before a transformation that needs them,
> 3) investigate what needs to be done for a parallel execution of many passes, e.g. how can we avoid races on shared structures such as the constant pool.
>
> I was researching this on and off for the last few months, trying to figure out how to make the pass manager itself async. It's not easy and I'm not even
> sure it's possible. I'm not sure about GIMPLE, as I would have to ask the middle-end maintainer on the GCC side, but LLVM IR does not seem to have
> shared-state detection in the core classes, and the same goes for the back ends. So yes, this would interest me.
>
> The first place to start is working out which data structures are definitely shared. In terms of shared state, the biggest ones seem to be basic blocks and function definitions, as
> those would be shared by passes running on each function.  We should start by looking at implementing locks or ref counting here first, if you're OK with that.
> It would also let me understand more concretely the linkage between the core classes, as would be required for multi-threading LLVM. In addition,
> it would let us look into how to partition the work across threads at the same time.
>
> As was discussed on the previous thread, the general assumption is that one wouldn't try to run two function optimizations on the same function at the same time, but would, for instance, run function optimizations on unrelated functions at the same time (or CGSCC passes on distinct CGSCCs). But this is difficult in LLVM IR because use lists are shared: if two functions use the same global variable or call the same third function, optimizing out a function call from each of those functions becomes a write to shared state when updating the use list of that third function. MLIR apparently has a different design in this regard that is intended to be more amenable to these situations.

As mentioned on the other thread, the main challenge here is in the
use lists of constant values (which also include globals and
functions). Right now, those are linked lists that are global for an
entire LLVMContext. Every addition or removal of a use of a constant
has to touch them, and doing such fine-grained locking doesn't seem
like a great idea.
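
To make the problem concrete, here is a minimal single-threaded sketch of
a shared use list. The types `Callee` and `Caller` and the function
`sharedWritesAfterOptimizingBoth` are hypothetical toy stand-ins, not
LLVM's actual classes; the only point is that edits made inside two
unrelated callers both land on the same list owned by the callee.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Toy model only: hypothetical types, not LLVM's Value/Use/Function.
struct Callee {
    std::list<const void *> useList;  // one entry per call site, shared by all callers
    std::size_t writes = 0;           // counts every mutation of the shared list
};

struct Caller {
    Callee *callee;
    std::vector<std::list<const void *>::iterator> callSites;

    // Adding a call inside this caller links a node into the callee's list.
    void addCall() {
        callSites.push_back(callee->useList.insert(callee->useList.end(), this));
        ++callee->writes;
    }

    // Deleting a call inside this caller unlinks a node from the callee's list.
    void removeLastCall() {
        callee->useList.erase(callSites.back());
        callSites.pop_back();
        ++callee->writes;
    }
};

// "Optimizes" two unrelated callers and reports how often the shared
// use list was written along the way.
std::size_t sharedWritesAfterOptimizingBoth() {
    Callee g;
    Caller f1{&g, {}};
    Caller f2{&g, {}};
    f1.addCall();
    f2.addCall();
    f1.removeLastCall();  // a pass working only on f1 still writes to g
    f2.removeLastCall();  // a pass working only on f2 still writes to g
    assert(g.useList.empty());
    return g.writes;
}
```

If the two `removeLastCall()` calls ran on different threads, each would
need to take a lock on `g` (or on the whole context), which is the
fine-grained locking referred to above.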

So this is probably the biggest and seemingly most daunting thing
you'd have to address first, but it's feasible and seems like a good
idea to evolve LLVM IR in a direction where it ends up looking more
like MLIR and can avoid these locks.
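
As a rough illustration of that direction, the sketch below has each call
name its callee by symbol instead of linking into a use list on the
callee. This is only an approximation of the idea, assuming a toy
`Function` type and hypothetical helpers `removeCallsTo`, `countUses` and
`optimizeInParallel`; it is not MLIR's or LLVM's API. Each worker thread
touches only its own function, so no lock is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical toy IR: callees are referenced by name, so the callee
// itself holds no per-use state.
struct Function {
    std::string name;
    std::vector<std::string> callees;  // one entry per call site
};

// Stand-in for a function pass: drop every call to `dead`.
// It reads and writes only `fn`.
void removeCallsTo(Function &fn, const std::string &dead) {
    fn.callees.erase(std::remove(fn.callees.begin(), fn.callees.end(), dead),
                     fn.callees.end());
}

// Uses are not tracked incrementally; they are found by scanning.
std::size_t countUses(const std::vector<Function> &module,
                      const std::string &symbol) {
    std::size_t n = 0;
    for (const Function &fn : module)
        n += std::count(fn.callees.begin(), fn.callees.end(), symbol);
    return n;
}

// Runs the pass on every function concurrently, one thread per function,
// then returns the remaining number of uses of `dead`.
std::size_t optimizeInParallel(std::vector<Function> &module,
                               const std::string &dead) {
    std::vector<std::thread> workers;
    for (Function &fn : module)
        workers.emplace_back(removeCallsTo, std::ref(fn), std::cref(dead));
    for (std::thread &t : workers)
        t.join();
    return countUses(module, dead);
}
```

The trade-off in this sketch is that answering "who uses this symbol?"
now costs a scan over the module rather than a walk of a ready-made
list, so use queries get slower in exchange for lock-free local edits.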

Cheers,
Nicolai

-- 
Learn how the world really is,
but never forget how it should be.