[LLVMdev] Upstreaming PNaCl's IR simplification passes

Thu Mar 6 18:09:00 PST 2014

There are a lot of questions in your post, so I'll focus on the technical
questions about specific IR passes in this first reply...

On 4 March 2014 15:17, Chandler Carruth <chandlerc at google.com> wrote:

> On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>
>> Some background:  There are two related use cases for these IR
>> simplification passes:
>>
>
>>  1) Simplifying the task of writing a new LLVM backend.  This is
>> Emscripten's use case.  The IR simplification passes reduce the number of
>> cases a backend has to handle, so they would be useful for anyone else
>> creating a new backend.
>>
>
> If these simplify writing a backend, why wouldn't the patches include
> commensurate simplifications to LLVM's backends? That would both give them
> an in-tree customer, and more immediate value to the community and project
> as a whole.
>

That's a good question.  I'll have to have a look around in the LLVM
backend code and see what parts could be replaced by one of PNaCl's
simplification passes.

One answer is that, in some cases, such as calling conventions and global
constructor arrays, LLVM's backend is constrained to follow the ABIs for
particular OSes and architectures.  Compatibility makes complexity harder
to remove.  I'll elaborate more below.  This only applies to a few of
PNaCl's IR passes though.

>
>
>>  2) Using a subset of LLVM IR as a stable distribution format for
>> portable executables.  This is PNaCl's use case.  PNaCl's IR subset omits
>> various complex IR features, which we lower using the IR simplification
>> passes [2].  Renderscript is an example of another project that uses IR as
>> a stable distribution format, though I think currently Renderscript is not
>> subsetting IR much.
>>
>
> Given that the bitcode is stable, I don't understand why this is important.
>

Is the bitcode format stable now?  I heard talk that LLVM is trying to do
this now, but I don't remember seeing an llvmdev thread stating that for
sure.  Was there a thread about it that I missed?  I just remember hearing
complaints last year that the format was still getting changed. :-)

>>  * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
>> varargs and by-value argument passing respectively.  They would be useful
>> for any backend that doesn't want to implement varargs or by-value calling
>> conventions.
>>
>
> Why wouldn't these be applicable to existing backends? What is hard about
> the existing representations?
>

For the calling conventions lowering passes, you wouldn't want to use them
in backends that have to match some existing architecture-specific ABI for
calling conventions.  For example, if you use ExpandVarArgs on x86, your .o
file won't be able to successfully call the printf() function provided by
libc.so, because the varargs calling conventions won't match.

But for many targets that is not an issue, either because:

 * there is no existing architecture-specific ABI that LLVM must match, or
 * you're using static linking, or can make similar "closed world"
assumptions, so that a module can use any calling conventions as long as
they're used consistently within the module.

Both of these are true for PNaCl and Emscripten.
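
To make the ABI point concrete, here is a rough C sketch of the general
technique of lowering varargs into an explicit argument buffer. It is not
taken from the PNaCl sources; the function names are hypothetical, and the
buffer layout (a plain array of int) is an assumption made for illustration.
A caller lowered this way can no longer call a libc function such as printf()
that expects the platform's own varargs convention, but it works as long as
caller and callee are lowered consistently.

```c
#include <stdarg.h>

/* Before: a conventional variadic function. How the variable arguments
   are passed is dictated by the target's calling convention. */
int sum_varargs(int count, ...) {
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* After (hypothetical lowered form): the callee takes an ordinary
   pointer to a buffer holding the variable arguments. */
int sum_lowered(int count, const int *argbuf) {
    int total = 0;
    for (int i = 0; i < count; i++)
        total += argbuf[i];
    return total;
}

/* Lowered call site for what used to be sum_varargs(3, a, b, c):
   the caller packs the arguments into a stack buffer itself. */
int call_sum_lowered(int a, int b, int c) {
    int argbuf[3] = { a, b, c };
    return sum_lowered(3, argbuf);
}
```

After this kind of lowering, a backend only ever sees fixed-arity calls.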

My suspicion is that one or both of these conditions will be true for other
novel backends, such as for specialised architectures like GPUs.

Aside from PNaCl and Emscripten, I am less familiar with other novel
backends.  So one of the things I had hoped to learn from this discussion
was whether other backends would find these passes useful.  So far we've
had some people say that yes, they would.

>>  * Instruction-level lowering:
>>     * ExpandStructRegs splits up struct values into scalars, removing the
>> "insertvalue" and "extractvalue" instructions.
>>
>
> There are already passes that do this outside of function arguments and
> return values. Why is a new one needed?
>

Are you referring to the work that SelectionDAGBuilder.cpp does to convert
insertvalue/extractvalue to a SelectionDAG?  I don't think there's an
IR-to-IR pass in LLVM for doing this, is there?

The reason PNaCl needs an IR-to-IR pass is that PNaCl's stable IR omits
insertvalue/extractvalue, in order to keep the format simple and reduce the
set of constructs that a PNaCl translator implementation needs to handle.
The reason Emscripten's fastcomp uses ExpandStructRegs is to keep
Emscripten's backend simple, given that it doesn't use lib/CodeGen.

And the reason we have to handle insertvalue/extractvalue at all is largely
that Clang outputs them for uses of C++ method pointers.  Otherwise,
structs-as-registers aren't really used.  At least, that was the case in
3.3 -- maybe some more uses have appeared since then.
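
As a rough illustration of what splitting struct values into scalars means,
here is a C sketch. The two-field struct is shaped like an Itanium-ABI member
function pointer (a pointer-sized value plus an adjustment); the function
names are hypothetical, and the comments only indicate which IR instruction
each line loosely corresponds to.

```c
#include <stddef.h>

/* Two-field aggregate shaped like an Itanium-ABI member function pointer. */
struct memptr { ptrdiff_t ptr; ptrdiff_t adj; };

/* Before: the aggregate is built and taken apart as a single value. */
int is_null_before(ptrdiff_t ptr, ptrdiff_t adj) {
    struct memptr mp = { 0, 0 };
    mp.ptr = ptr;           /* roughly: insertvalue ..., 0 */
    mp.adj = adj;           /* roughly: insertvalue ..., 1 */
    return mp.ptr == 0;     /* roughly: extractvalue ..., 0 */
}

/* After: each field is tracked as its own scalar, so no aggregate
   value (and no insertvalue/extractvalue) remains. */
int is_null_after(ptrdiff_t ptr, ptrdiff_t adj) {
    ptrdiff_t mp_ptr = ptr;
    ptrdiff_t mp_adj = adj;
    (void)mp_adj;           /* unused here; kept to mirror the struct */
    return mp_ptr == 0;
}
```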

> How do you handle the overflow-detecting operations?
>

PNaCl has the ExpandArithWithOverflow pass, which lowers uses of
llvm.*.with.overflow.*.
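
For readers unfamiliar with these intrinsics, the following C sketch shows
one common way to express unsigned add and multiply with overflow detection
using only plain arithmetic and comparisons. It illustrates the kind of
expansion involved and is not a description of what ExpandArithWithOverflow
actually emits; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the {result, overflow bit} pair the intrinsics return. */
struct u32_overflow { uint32_t value; bool overflow; };

/* Counterpart of llvm.uadd.with.overflow.i32 using a plain add. */
struct u32_overflow uadd_with_overflow_u32(uint32_t a, uint32_t b) {
    struct u32_overflow r;
    r.value = a + b;            /* wraps modulo 2^32 */
    r.overflow = r.value < a;   /* wrapped iff the sum is below an operand */
    return r;
}

/* Counterpart of llvm.umul.with.overflow.i32 using a widened multiply. */
struct u32_overflow umul_with_overflow_u32(uint32_t a, uint32_t b) {
    struct u32_overflow r;
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    r.value = (uint32_t)wide;           /* low 32 bits */
    r.overflow = (wide >> 32) != 0;     /* any high bit set means overflow */
    return r;
}
```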

>
>>     * PromoteIntegers legalizes integer types (e.g. i30 is converted to
>> i32).
>>
>
> Does it split up too-wide integers?
>

PNaCl's version currently doesn't.  Emscripten's fastcomp has a version
which splits up 64-bit integer operations into 32-bit operations, which
they need because JavaScript doesn't support 64-bit integer arithmetic.
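
As a sketch of what splitting a 64-bit operation into 32-bit operations looks
like, here is a 64-bit add written with two 32-bit adds and a carry. This is
the textbook technique, not code from fastcomp; the type and helper names are
hypothetical.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves. */
struct u64_halves { uint32_t lo; uint32_t hi; };

struct u64_halves u64_split(uint64_t v) {
    struct u64_halves h = { (uint32_t)v, (uint32_t)(v >> 32) };
    return h;
}

uint64_t u64_join(struct u64_halves h) {
    return ((uint64_t)h.hi << 32) | h.lo;
}

/* 64-bit add using only 32-bit arithmetic. */
struct u64_halves add64_via_32(struct u64_halves a, struct u64_halves b) {
    struct u64_halves r;
    r.lo = a.lo + b.lo;                 /* low halves, wraps modulo 2^32 */
    uint32_t carry = r.lo < a.lo;       /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;         /* high halves plus carry */
    return r;
}
```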

PNaCl's version didn't need to do that because we were happy to support
64-bit arithmetic in PNaCl's stable ABI.  However, we did find that unusual
C bitfields caused Clang to generate integer types wider than 64 bits
(which we don't support in PNaCl's stable ABI), so we started implementing
a pass to split those up.  We should probably sync up with Emscripten and
reuse their code for that.

> Do we really want another integer legalization framework in LLVM?
>

At the risk of not answering your question directly, LLVM already has two
instruction selectors, SelectionDAG and FastISel.  So another question
might be, when is it OK to have multiple implementations that perform
similar tasks using different approaches, and when is it not OK?  What are
the trade-offs involved here?

> I am actually interested in doing (partial) legalization in the IR during
> lowering (codegenprep time) in order to simplify the backend, but I don't
> think we should develop such a framework independently of the legalization
> currently used in the backends.
>
>
>>
>>  * Module-level lowering:  This implements, at the IR level,
>> functionality that is traditionally provided by "ld".  e.g. ExpandCtors
>> lowers llvm.global_ctors to the __init_array_start and __init_array_end
>> symbols that are used by C libraries at startup.
>>
>
> This doesn't make any sense to me. The IR representation is strictly
> simpler. It is trivially lowered in a backend. I don't understand what this
> would benefit.
>

To elaborate:  In PNaCl, pexes are statically linked modules in which
running global constructors is handled by user code inside the pexe.  The
special llvm.global_ctors array isn't part of PNaCl's stable subset of IR,
because there's no need for it to be.  Running constructors is done in
normal IR by the pexe's entry point, without constructors needing to be
handled specially by PNaCl's IR format.

LLVM's global_ctors construct is incomplete:  it provides a mechanism, at
the IR level, to declare functions to be run at startup, but it assumes
that running these functions will be done by a runtime library.  At the IR
level, LLVM doesn't provide a way to implement a runtime library that can
read that constructor list.  ld linker scripts provide a way to do that --
e.g. on Linux, see /usr/lib/ldscripts/elf_i386.x, which defines
__init_array_{start,end} -- but that's not at the IR level.

ExpandCtors just provides a mechanism for a runtime library to list the
constructor functions, purely at the IR level, without constructors having
to be a special feature in the PNaCl ABI or in the Emscripten backend.
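
A rough C sketch of the runtime-library side follows. A real runtime would
walk the range between __init_array_start and __init_array_end; since those
symbols are not available in a standalone example, the array and the
init_array_start/init_array_end pointers below are hypothetical stand-ins,
as are the two constructor functions.

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Two example constructors that record the order in which they ran. */
static int init_log[2];
static int init_count;
static void ctor_a(void) { init_log[init_count++] = 1; }
static void ctor_b(void) { init_log[init_count++] = 2; }

/* Stand-in for the constructor list that a pass like ExpandCtors
   would expose as ordinary data delimited by start/end symbols. */
static const ctor_fn init_array[] = { ctor_a, ctor_b };
static const ctor_fn *const init_array_start = init_array;
static const ctor_fn *const init_array_end =
    init_array + sizeof(init_array) / sizeof(init_array[0]);

/* What the entry point's startup code does: call each constructor in order. */
void run_global_ctors(void) {
    for (const ctor_fn *f = init_array_start; f != init_array_end; ++f)
        (*f)();
}
```

The point is that nothing here is special: the list is a normal array and the
loop is normal code.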

>> There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM
>> already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.
>>
>
> Note that these are quite different -- they lower from a front-end
> convenient form toward the canonical IR form.
>

Those three passes don't lower towards canonical IR form -- unless we are
taking "canonical IR form" to mean quite different things?

LowerInvoke and LowerAtomic both strip out information irreversibly.

LowerAtomic "lowers atomic intrinsics to non-atomic form for use in a known
non-preemptible environment".  LowerInvoke strips out exception handling by
converting invokes to calls, so that landingpads, resumes, etc. become dead
and can be removed by a later pass.
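
To show how much information LowerAtomic-style lowering discards, here is a
C11 sketch of an atomic fetch-and-add next to a non-atomic equivalent. The
function names are hypothetical, and the second function is only valid under
the "known non-preemptible environment" assumption quoted above.

```c
#include <stdatomic.h>

/* Before: an atomic read-modify-write (what atomicrmw add expresses). */
int fetch_add_atomic(atomic_int *p, int v) {
    return atomic_fetch_add(p, v);
}

/* After: a plain load, add and store. The atomicity guarantee is gone
   and cannot be recovered from this form. */
int fetch_add_lowered(int *p, int v) {
    int old = *p;
    *p = old + v;
    return old;
}
```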

(As an aside, LowerInvoke has an option for using SJLJ exception handling,
but that option appears to be unused and replaced
by lib/CodeGen/SjLjEHPrepare.cpp.)

LowerSwitch "rewrites switch instructions with a sequence of branches,
which allows targets to get away with not implementing the switch
instruction until it is convenient".
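
In C terms, the effect is roughly the following. The chain of comparisons is
a simplification chosen for readability; the real pass may well arrange the
comparisons differently (for example as a tree), and the function names are
hypothetical.

```c
/* Before: a multi-way switch. */
int classify_switch(int x) {
    switch (x) {
    case 0:  return 10;
    case 1:  return 20;
    case 5:  return 30;
    default: return -1;
    }
}

/* After: the same control flow using only two-way conditional branches. */
int classify_branches(int x) {
    if (x == 0) return 10;
    if (x == 1) return 20;
    if (x == 5) return 30;
    return -1;
}
```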

These three are very similar in function to PNaCl's IR simplification
passes, since they reduce the set of language features that must be
supported by a backend or by a stable IR format.

> You are talking about something totally different that deals with
> target-oriented lowering. The correct place to look for analogies is
> CodeGenPrep.
>

CodeGenPrepare.cpp just contains optimisations, doesn't it?  It doesn't
lower any language features such that the feature is removed from the
module, so it doesn't seem to be analogous to PNaCl's IR simplification
passes, which do do that.  e.g. LowerAtomic strips out atomicrmw entirely
so that anything processing LowerAtomic's output doesn't have to handle
atomicrmw at all.  Similarly, ExpandByVal expands out "byval" entirely.
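
As a sketch of what expanding out "byval" amounts to, here is the idea in C:
the implicit copy made by by-value passing becomes an explicit copy in the
caller, and the callee receives an ordinary pointer. This shows the general
technique under that assumption rather than the exact output of ExpandByVal;
the struct and function names are hypothetical.

```c
struct point { int x; int y; };

/* Before: the struct is passed by value, so the callee can modify its
   private copy without affecting the caller. */
int bump_and_sum_byval(struct point p) {
    p.x += 1;
    return p.x + p.y;
}

/* After: the callee takes a plain pointer and modifies what it points to. */
int bump_and_sum_ptr(struct point *p) {
    p->x += 1;
    return p->x + p->y;
}

/* Lowered call site: the caller makes the copy explicitly, which
   preserves the by-value semantics. */
int call_bump_lowered(const struct point *orig) {
    struct point copy = *orig;
    return bump_and_sum_ptr(&copy);
}
```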

If you're looking for backend IR-to-IR passes which lower language
features, DwarfEHPrepare and SjLjEHPrepare are analogous to PNaCl's passes.
DwarfEHPrepare only lowers resume instructions, while SjLjEHPrepare
handles more.

Cheers,
Mark