[llvm-dev] GSoC19 - LLVM JIT Infrastructure

Mon Mar 18 20:09:22 PDT 2019

Hi Preejackie,

Sorry for the delayed reply. I was quite busy over the weekend.

> 1) Is there any hard contract that the JIT should compile a full module
> (basic unit) into native code, or can it extract only hot functions from
> the module, compile them to native code, and leave the client of the JIT
> (maybe an interpreter) to interpret the rest of that module? Or is it the
> client's job to pack all hot functions into a module and transfer that to
> the JIT layer? If the latter is the case, how can the JIT benefit from
> speculating (I'm a bit lost here)?

It is possible to extract only hot functions. The mental model for doing
this is slightly complicated though, because of the way ORC’s design has
been influenced by a desire to integrate well with static compilers (like
LLVM): The JIT always starts materialization of an entire
MaterializationUnit (e.g. IR Module, object, etc.). This mirrors the way
static compilers work: they always compile a whole module at a time. In
ORC, however, clients are free to write custom MaterializationUnits that
break up the underlying module representation and only continue compilation
of part of it.
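
To make the "break up the unit and only compile part of it" idea concrete, here is a minimal sketch in plain C++. It is a toy model under stated assumptions, not ORC code: ToyUnit, SplitResult, toyCompile and materializeRequested are hypothetical names invented for illustration, and "compilation" is just a string transformation.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a unit of code handed to a JIT: function name -> body.
// Hypothetical type for illustration only; this is not an ORC class.
struct ToyUnit {
  std::map<std::string, std::string> Functions;
};

// What comes out of materializing only part of a unit.
struct SplitResult {
  std::map<std::string, std::string> Compiled; // name -> pretend native code
  ToyUnit Deferred;                            // functions left uncompiled
};

// Pretend compilation: wraps the body so the effect is observable.
inline std::string toyCompile(const std::string &Body) {
  return "native(" + Body + ")";
}

// Compile only the requested functions and hand everything else back as a
// new, smaller unit that can be materialized later (or never).
inline SplitResult materializeRequested(const ToyUnit &Unit,
                                        const std::set<std::string> &Requested) {
  SplitResult R;
  for (const auto &KV : Unit.Functions) {
    if (Requested.count(KV.first))
      R.Compiled[KV.first] = toyCompile(KV.second);
    else
      R.Deferred.Functions[KV.first] = KV.second;
  }
  // Every function ends up in exactly one of the two halves.
  assert(R.Compiled.size() + R.Deferred.Functions.size() ==
         Unit.Functions.size());
  return R;
}
```

Calling materializeRequested with a unit holding "hot" and "cold" and a request for only "hot" compiles "hot" and returns "cold" untouched in Deferred. A real custom MaterializationUnit would do the equivalent split on an IR module.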

> I have come up with an idea for the speculation task:
> 1) Construct a local (per-module) function ordering list based on the
> sequence of calls in the control flow graphs, create a stub for each
> symbol from another module that is referenced by the current function, and
> put the module in a waiting list.

You should not need to worry about stub creation directly. ORC manages
this via the lazy re-exports utility. You configure two JITDylibs, one
(libFoo in this example) that everyone will link to, and a second
(libFoo.impl) that will contain the actual definitions. For each
definition, the lazyReexports utility will build a stub in libFoo which,
when it is executed, will trigger a lookup of the corresponding symbol in
libFoo.impl, then jump to it.

  stubs:                           implementations:
+--------+                         +-------------+
| libFoo |                         | libFoo.impl |
+========+                         +-------------+
|  foo   |                         |     foo     |
|  bar   | -- on call, looks up -> |     bar     |
|  baz   |                         |     baz     |
+--------+                         +-------------+
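
The stub/implementation split can be sketched as a toy model in plain C++. This is an illustration only and makes no claim about how lazyReexports is implemented: ToyImplDylib and ToyStubDylib are hypothetical classes, "compilation" is modelled as moving a callable from a pending table to a ready table, and no real stubs or machine code are involved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy "libFoo.impl": holds definitions and "compiles" them on first lookup.
// Hypothetical class for illustration; not part of ORC.
class ToyImplDylib {
public:
  void define(const std::string &Name, std::function<int(int)> Body) {
    Pending[Name] = std::move(Body);
  }

  // The first lookup of a symbol triggers (pretend) compilation; later
  // lookups return the already compiled definition.
  const std::function<int(int)> &lookup(const std::string &Name) {
    auto Ready = Compiled.find(Name);
    if (Ready != Compiled.end())
      return Ready->second;
    auto It = Pending.find(Name);
    assert(It != Pending.end() && "symbol not defined");
    ++CompileCount;
    auto Inserted = Compiled.emplace(Name, std::move(It->second));
    Pending.erase(It);
    return Inserted.first->second;
  }

  int CompileCount = 0; // how many definitions have been compiled so far

private:
  std::map<std::string, std::function<int(int)>> Pending, Compiled;
};

// Toy "libFoo": callers only see this side. Each call goes through a stub
// that looks the symbol up in the implementation dylib and jumps to it.
class ToyStubDylib {
public:
  explicit ToyStubDylib(ToyImplDylib &Impl) : Impl(Impl) {}
  int call(const std::string &Name, int Arg) { return Impl.lookup(Name)(Arg); }

private:
  ToyImplDylib &Impl;
};
```

Defining "bar" in a ToyImplDylib and calling it twice through a ToyStubDylib leaves CompileCount at 1: nothing is compiled until the first call, and nothing is compiled twice.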

So for the speculation project you will want to track this association.
When you see someone reference "bar" in libFoo, and you determine that it
would be profitable to speculatively compile the body of "bar", you would
consult your tracker and find that the implementation of this function is
provided by "bar" in libFoo.impl. You would then issue a lookup for "bar"
in libFoo.impl and discard the result. This will trigger compilation of the
body of bar. ORC will manage synchronization so that it doesn't matter
whether your lookup for speculation comes before, after, or during any
other lookup of "bar": it will only be compiled once, and nobody will call
into it until it is ready.
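
A tracker of this kind might look like the following sketch. It is a single-threaded toy in plain C++ under stated assumptions: SymbolRef and SpeculationTracker are hypothetical names, the lookup itself is supplied by the caller as a callback, and the synchronization that ORC provides is not modelled at all.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// (dylib name, symbol name). Hypothetical alias used only in this sketch.
using SymbolRef = std::pair<std::string, std::string>;

// Remembers which implementation symbol stands behind each stub symbol.
class SpeculationTracker {
public:
  // Record that the stub symbol is backed by the implementation symbol.
  void recordAlias(SymbolRef Stub, SymbolRef Impl) {
    StubToImpl[std::move(Stub)] = std::move(Impl);
  }

  // If the stub is known, issue a lookup for its implementation through the
  // caller-provided callback. Any result of that lookup is discarded: the
  // only purpose is to start compilation early. Returns false if the stub
  // has no recorded implementation.
  bool speculate(const SymbolRef &Stub,
                 const std::function<void(const SymbolRef &)> &IssueLookup) const {
    assert(IssueLookup && "a lookup callback is required");
    auto It = StubToImpl.find(Stub);
    if (It == StubToImpl.end())
      return false;
    IssueLookup(It->second);
    return true;
  }

private:
  std::map<SymbolRef, SymbolRef> StubToImpl;
};
```

After recording that ("libFoo", "bar") is backed by ("libFoo.impl", "bar"), a call to speculate for the libFoo symbol invokes the callback with the libFoo.impl symbol. In a real client the callback would be whatever issues a JIT lookup.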

> These speculation actions should be buried in the internals of the
> concurrent compiler and should not be visible to clients, right?

Actually, ORC is designed as a library of components. The aim would be to
build components that support speculation so that other people can
optionally use your work to add speculation to their compiler.

A good test environment would be lli, the LLVM interpreter tool: it uses
the JIT to execute LLVM IR, which makes it a convenient platform for trying
out speculation.

Cheers,
Lang.

Sent from my iPad

On Mar 15, 2019, at 1:44 PM, preejackie <praveenvelliengiri at gmail.com>
wrote:

Hi Lang,

I'm going through the design and code of the ORC JIT. I don't completely
understand it yet, so I have a few questions.

From the demo in your talk "Updating ORC for concurrency" I noticed that
ORC currently has an addModule(...) function, which lets the JIT compile
an IR module into native code.

1) Is there any hard contract that the JIT should compile a full module
(basic unit) into native code, or can it extract only hot functions from
the module, compile them to native code, and leave the client of the JIT
(maybe an interpreter) to interpret the rest of that module? Or is it the
client's job to pack all hot functions into a module and transfer that to
the JIT layer? If the latter is the case, how can the JIT benefit from
speculating (I'm a bit lost here)?

I have come up with an idea for the speculation task:

1) Construct a local (per-module) function ordering list based on the
sequence of calls in the control flow graphs, create a stub for each
symbol from another module that is referenced by the current function, and
put the module in a waiting list.

2) Replace the stub with the actual native code address and notify the
modules in the waiting list.

Creating the function ordering list involves analysing the control flow
graphs and the branch probabilities for conditional function calls. I'm also
trying to figure out whether function attributes fit into this picture,
along with things like module summaries from ThinLTO builds.

These speculation actions should be buried in the internals of the
concurrent compiler and should not be visible to clients, right?

How can I proceed with this plan? I'm open to comments!

Also, I haven't received any responses yet to the post I wrote on the
llvm-dev list. IMHO it is difficult to get this project through GSoC
without a mentor. Do you have any thoughts on it?

I would highly appreciate your kind help :)

On 14/03/19 4:15 AM, preejackie wrote:

Hi Lang,
On 13/03/19 4:55 AM, Lang Hames wrote:

Hi Preejackie,

As far as I know, much of the literature discusses speculative optimization
for ahead-of-time compilation (please correct me if I'm wrong).

I am actually not sure what speculative optimization would mean in the
context of ahead of time optimization. Possibly it could refer to the use
of profile guided optimization to optimize for a certain set of assumptions
(derived from the profiling information) with a fallback for when those
assumptions do not hold.

The kind of speculative compilation that I am talking about only makes
sense in the context of a JIT compiler that supports multiple threads of
compilation. In that case the JIT compiler is free to run ahead of the
actual execution, compiling functions that it has determined will be
called, or are likely to be called, in the future. The challenges in doing
this in LLVM are:

(1) Infrastructure: making it possible to speculate at all. Right now all
the foundational code is there, but the APIs and glue code to make it work
are not there yet, and indeed not fully designed yet.

(2) Making good speculation decisions: Even once you *can* speculate, you
still want to speculate well. This would involve program analysis on higher
level program representations (e.g. LLVM IR for a proof-of-concept), and
could also involve profiling data gathered for profile guided optimizations.

I understand, and I am working on finding heuristics for better speculation.

It would be a great help if you point me to some references that you found
interesting & relevant to speculative compilation.

I have not actually done a literature search yet. The bulk of my recent
work focused on simply building the infrastructure for safe multi-threaded
JIT compilation in LLVM.

Okay, no problem.

I have mailed the list with the subject "Improving Speculative compilation
in concurrent orc jit". Please see that.

Thanks a lot

Cheers,
Lang.

Sent from my iPad

On Mar 12, 2019, at 2:27 PM, preejackie <praveenvelliengiri at gmail.com>
wrote:

Hi Lang,

Thank you very much for your reply.

Yeah, I think it would be nice to refer to some existing literature on the
subject. I also started a thread in the Numba JIT project (which uses LLVM)
to see whether they have anything implemented up-front. As far as I know,
much of the literature discusses speculative optimization for ahead-of-time
compilation (please correct me if I'm wrong).

Of course, I wish to pursue this, and I will write to the dev list as soon
as possible with a basic idea and plan, and look for mentors.

It would be a great help if you point me to some references that you found
interesting & relevant to speculative compilation.

Thanks
On 13/03/19 12:11 AM, Lang Hames wrote:

Hi Preejackie,

I would like to help, but am concerned that my limited time and lack of
prior experience with GSoC will make me a less than ideal mentor. I would
encourage you to search for a mentor on list, but I will try to answer as
many of your questions as I can.

Regarding speculative compilation: This has definitely been done before,
and there should be existing literature on making good speculation
decisions. While that is worth looking at, I would be inclined to start out
by building a framework for testing the ideas, starting with very basic
speculation decisions (e.g. only compiling unconditionally called code) and
then go from there. That gives you a basis for comparing your techniques.
The part of the exercise that is of the most immediate interest is coming
up with an API to fit speculation into the existing ORC infrastructure.
This is something I have basic ideas about, but nothing detailed yet.
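
As an illustration of what such a very basic decision rule could look like, here is a small sketch in plain C++. It is a toy under stated assumptions, not LLVM code: ToyBlock and unconditionalCallees are hypothetical names, a function is modelled as a vector of blocks with block 0 as the entry, and the rule is deliberately conservative. It only follows the straight-line chain of single-successor blocks from the entry, so every call it reports runs whenever the function runs (assuming each earlier call returns normally).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy basic block: the functions it calls, in order, and its successors.
// Hypothetical type for illustration; this is not llvm::BasicBlock.
struct ToyBlock {
  std::vector<std::string> Callees;
  std::vector<std::size_t> Successors; // indices into the block vector
};

// Collect callees that are reached on every execution of the function by
// walking from the entry block (index 0) along single-successor edges.
// The walk stops after the first block that branches or has no successor,
// or when it would revisit a block.
inline std::vector<std::string>
unconditionalCallees(const std::vector<ToyBlock> &Blocks) {
  std::vector<std::string> Result;
  std::set<std::size_t> Visited;
  std::size_t Cur = 0;
  while (Cur < Blocks.size() && Visited.insert(Cur).second) {
    const ToyBlock &B = Blocks[Cur];
    Result.insert(Result.end(), B.Callees.begin(), B.Callees.end());
    if (B.Successors.size() != 1)
      break; // conditional branch or exit: nothing after this is certain
    assert(B.Successors[0] < Blocks.size() && "successor index out of range");
    Cur = B.Successors[0];
  }
  return Result;
}
```

For a function whose entry block calls "init" and falls through to a block that calls "bar" and then branches, the result is "init" and "bar". Callees that appear only after the branch are left out, which makes this a natural baseline to compare smarter heuristics against.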

If you do wish to pursue this we should aim to have a conversation on the
LLVM dev list about how to design it. That way the whole community will be
able to follow along, and it will serve as useful documentation.

Cheers,
Lang.

On Tue, Mar 12, 2019 at 10:41 AM preejackie <praveenvelliengiri at gmail.com>
wrote:

> Hi Lang,
>
> I'm following up on this mail so that you don't lose it in your schedule
> :) I have started working on finding heuristics for speculative compilation
> support.
>
> Are you available this summer to mentor this project? Or I can take it to
> the list, to find anyone interested.
>
> Thank you
>
> On 08/03/19 9:31 PM, preejackie wrote:
>
> Dear Lang,
>
> I forgot to ask you, whether you are available this summer to mentor this
> project?
>
> Thanks
>
>
> On 08/03/19 1:59 AM, preejackie wrote:
>
> Hi Lang,
>
> Thank you very much for your reply.
>
> Both the better caching support and the speculative compilation support
> ideas are very interesting, and I hope they have good scope within the
> LLVM community. I will investigate both project ideas in a bit more detail
> and choose one to write the proposal :). I haven't gone through the
> technical complexities in detail, so I want to ask whether these projects
> can be completed within 3 months with a realistic timeline?
>
> Once we get to a concrete idea on design and implementation, I would like
> to post an RFC on the llvm-dev list, and I wish to do that as soon as
> possible. I googled for docs on the LLVM JIT, but I was only able to find
> a few articles. Could you please point me to some good (beginner-friendly)
> articles :)
>
> In the meantime, I will also check whether any compilers have done
> something similar to this. If I get any ideas while working on this, I
> will tell you.
>
> And sorry for the delayed reply, I was engaged in my university classes.
>
>
> On 07/03/19 2:53 AM, Lang Hames wrote:
>
> Hi Praveen,
>
> Sorry for the delayed reply.
>
> You're the first person to ask about JIT specific projects in a while. I
> will have a think about it and see what I can come up with.
>
> Off the top of my head, I can tell you a few of the things that are on my
> to-do and wish lists. Let me know if any of these interest you:
>
> TODO list:
> (1) A C API for the new, concurrent ORC (ORCv2) APIs. This one might be a
> bit simple for GSoC, but it's one that LLVM users would appreciate, and is
> a good high-level introduction to the new APIs.
>
> (2) Update Kaleidoscope tutorials, including the "Building A JIT" series,
> to use concurrent ORC.
>
> (3) Better caching support. ORC can re-use relocatable object files from
> previous compilations, so if a JIT input hasn't changed we can just re-use
> the old object file (assuming we saved it somewhere). LLVM already has a
> basic form of support for this in llvm::ObjectCache (
> http://llvm.org/doxygen/classllvm_1_1ObjectCache.html) but it would be
> nice to develop a more general scheme, rather than being tied to LLVM IR
> input. For example, it would be nice if, when JITing Swift, we could
> recognize Swift code (in the form of text, ASTs, or SIL) that we've already
> compiled, bypass LLVM altogether, and go straight to the object file.
>
> (4) Speculative compilation support. One of the selling points of the
> concurrent ORC APIs is that you can start compiling a function before you
> need it, in the hope that by the time that you do need it it is already
> compiled. However, if we just speculatively compile everything we're
> quickly going to overload our memory/CPU resources, which will be lousy for
> performance. What we really want to do is to take into account any
> information we have (from high level program representations, e.g. ASTs or
> CFGs, or from profiling data from previous executions) to try to guess what
> functions are worth speculatively compiling next.
>
> Wish list:
> (1) JITLink support for ELF and/or COFF. JITLink is a super-new (it's
> still in review: https://reviews.llvm.org/D58704) replacement for
> RuntimeDyld, the JIT linker that currently supports MCJIT and ORC. JITLink
> aims to provide a cleaner API that integrates better with the concurrent
> ORC APIs, dead-symbol-stripping support, and small code model support. I'm
> working on MachO support, but would love a hand with ELF and COFF support.
>
> I'm sure more ideas will come to me over time. :)
>
> Do any of those projects interest you?
> Is there any aspect of the JIT that you are particularly keen to work on,
> or to learn more about?
>
> Kind Regards,
> Lang.
>
>
> On Tue, Mar 5, 2019 at 12:48 PM preejackie <praveenvelliengiri at gmail.com>
> wrote:
>
>> Hi Lang,
>>
>> I'm Praveen Velliengiri, a master's student from India. I would like to
>> participate in GSoC19 with LLVM. I'm very much interested in
>> contributing to the JIT infrastructure of LLVM. As a first step, I have
>> gone through your talk titled "ORC- LLVM next gen JIT API" and some docs
>> on JIT infrastructure. I have followed the "Kaleidoscope" and "Writing an
>> LLVM Pass" tutorials and am currently following "Implementing a JIT in
>> LLVM" to make myself familiar with the LLVM JIT infrastructure.
>>
>> I find it really interesting and hope that I can learn more about JIT
>> compilation and LLVM in general. Unfortunately, I can't find any project
>> ideas related to the LLVM JIT infrastructure in the GSoC ideas list. As
>> I'm relatively new to LLVM, I find it hard to frame a project idea
>> myself. I would like to know whether there are any improvements that can
>> be made to the LLVM JIT. If so, could you please point me in the right
>> direction, or to bugs, so that I can find out whether there are any
>> opportunities to improve the JIT infrastructure? It would greatly help me
>> to propose a project idea for this GSoC 2019.
>>
>> I would highly appreciate your thoughts on this!
>> Thanks a lot for your time :)
>>
>> --
>> Have a great day!
>> PreeJackie
>>