[LLVMdev] Packages

Sun Nov 16 13:31:00 PST 2003

On Sun, 16 Nov 2003, Reid Spencer wrote:

> On Sun, 2003-11-16 at 11:17, Chris Lattner wrote:
>
> > No, it's all or nothing.  Once linked, they cannot be separated (easily).
> > However, especially when using the JIT, there is little overhead for
> > running a gigantic program that only has 1% of the functions in it ever
> > executed...
>
> Perhaps in the general case, but what if it's running on an embedded
> system and the "gigantic program" causes an out-of-memory condition?

The JIT doesn't even load unreferenced functions from the disk, so this
shouldn't be the case... (thanks to Misha for implementing this :)

Also, the globaldce pass deletes functions which can never be called by
the program, so large hunks of libraries get summarily removed from the
program after static linking.
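
The idea behind this kind of dead function elimination can be sketched as a
reachability walk over the call graph. This is only an illustration, not the
actual globaldce implementation: the call graph, the function names, and the
helpers `reachable_functions` and `strip_dead_functions` are all hypothetical,
and a real pass also has to account for things such as functions whose
address is taken.

```python
from collections import deque


def reachable_functions(call_graph, roots):
    """Return every function reachable from the roots by following calls."""
    seen = set(roots)
    worklist = deque(roots)
    while worklist:
        fn = worklist.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                worklist.append(callee)
    return seen


def strip_dead_functions(call_graph, roots):
    """Keep only the functions that can be reached from the roots."""
    live = reachable_functions(call_graph, roots)
    return {fn: callees for fn, callees in call_graph.items() if fn in live}


# Hypothetical statically linked program: a small "main" plus library code.
call_graph = {
    "main": ["printf", "helper"],
    "helper": ["strlen"],
    "printf": [],
    "strlen": [],
    "qsort": ["compare"],   # never called from main
    "compare": [],          # only called from qsort
}

linked = strip_dead_functions(call_graph, ["main"])
```

In this toy example `qsort` and `compare` are dropped because nothing
reachable from `main` calls them.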

> > There are multiple different ways to approach these questions depending on
> > what we want to do and what the priorities are.  There are several good
> > solutions, but for now, everything needs to be statically linked.  I
> > expect this to change over the next month or so.

> When you have time, I'd like to hear what you're planning in this area
> as it will directly affect how I build my compiler and VM.

What do you need, and what would you like?  At this point there are
several solutions that make sense, but they have to be balanced against
practical issues.  For example, say we do IPO across the package, and then
one of the members gets updated.  How do we know to invalidate the results?
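
One possible answer to the invalidation question, given purely as a sketch
and not as anything LLVM does, is to record a fingerprint of every package
member when the interprocedural results are computed, and compare against it
later. The file names and the helpers `fingerprint`, `record_inputs`, and
`stale_members` are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash of one package member (assumed to be its raw bytes)."""
    return hashlib.sha256(data).hexdigest()


def record_inputs(members):
    """Snapshot the fingerprints of all members at optimization time."""
    return {name: fingerprint(data) for name, data in members.items()}


def stale_members(recorded, members):
    """Names of members that were added, removed, or changed since the snapshot."""
    names = set(recorded) | set(members)
    return sorted(
        name
        for name in names
        if name not in members or recorded.get(name) != fingerprint(members[name])
    )


# Hypothetical package with two members.
package = {"a.bc": b"version 1 of a", "b.bc": b"version 1 of b"}
snapshot = record_inputs(package)

# Later, one member is updated; the cached results that depend on it are stale.
package["b.bc"] = b"version 2 of b"
changed = stale_members(snapshot, package)
```

Any non-empty result from `stale_members` would mean the cross-package
results can no longer be trusted and need to be recomputed.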

As I think that I have mentioned before, one long-term way of implementing
this is to attach analysis results to bytecode files as well as the code.
Thus, you could compile libc, say, with LLVM to a "shared object" bytecode
file.  While doing this, the optimizer could notice that "strlen" has no
side-effects, for example, and attach that information to the bytecode
file.

When linking a program that uses libc, the linker wouldn't pull in any
function bodies from "shared objects", but would read the analysis results
and attach them to the function prototypes in the program.  This would
allow the LICM optimizer to hoist strlen calls out of loops when it makes
sense, for example.
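
A minimal sketch of that flow, under heavy simplifying assumptions: the
summary format (`no_side_effects`), the `Call` record, and the helpers
`attach_summaries` and `hoistable_calls` are invented for illustration and do
not correspond to any real LLVM data structure. Arguments are modeled as
plain names, and `modified_in_loop` is assumed to list every value or buffer
the loop may change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    callee: str
    args: tuple


def attach_summaries(prototypes, library_summaries):
    """Attach library analysis results to prototypes; no bodies are linked in."""
    return {name: library_summaries.get(name, {}) for name in prototypes}


def hoistable_calls(loop_body, modified_in_loop, summaries):
    """Calls that could be moved out of the loop under this simple model."""

    def is_pure(call):
        return summaries.get(call.callee, {}).get("no_side_effects", False)

    # Conservative: a callee with unknown effects might write anything,
    # so in that case nothing is hoisted at all.
    if not all(is_pure(call) for call in loop_body):
        return []
    # A pure call is loop-invariant if none of its arguments change in the loop.
    return [
        call
        for call in loop_body
        if all(arg not in modified_in_loop for arg in call.args)
    ]


# Hypothetical summary stored alongside the libc "shared object".
libc_summaries = {"strlen": {"no_side_effects": True}}
summaries = attach_summaries(["strlen", "puts"], libc_summaries)

loop_body = [Call("strlen", ("s",)), Call("strlen", ("t",))]
hoisted = hoistable_calls(loop_body, {"t"}, summaries)
```

Here only `strlen(s)` qualifies, since `t` changes inside the loop. Adding a
call to `puts`, which has no summary, makes the sketch give up entirely.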

Of course there are situations when it is better to actually link the
function bodies into the program too.  In the strlen example, it might be
the case that the program will go faster if strlen is inlined into a
particular call site.

I'm inclined to start simple and work our way up to these cases, but if
you have certain usage patterns in mind, I would love to hear them, and we
can hash out what will really get implemented...

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/