[LLVMdev] Packages

Sun Nov 16 14:56:01 PST 2003

Chris Lattner wrote:
> On Sun, 16 Nov 2003, Reid Spencer wrote:
> 
> 
>>On Sun, 2003-11-16 at 11:17, Chris Lattner wrote:
>>
>>
>>>No, it's all or nothing.  Once linked, they cannot be separated (easily).
>>>However, especially when using the JIT, there is little overhead for
>>>running a gigantic program that only has 1% of the functions in it ever
>>>executed...
>>
>>Perhaps in the general case, but what if it's running on an embedded
>>system and the "gigantic program" causes an out-of-memory condition?
> 
> 
> The JIT doesn't even load unreferenced functions from the disk, so this
> shouldn't be the case... (thanks to Misha for implementing this :)
> 
> Also, the globaldce pass deletes functions which can never be called by
> the program, so large hunks of libraries get summarily removed from the
> program after static linking.
> 
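To make the globaldce point concrete, here is a minimal sketch of the idea: keep only the functions reachable from the program's entry points. The call graph, the function names, and the helpers `reachable_functions` and `drop_dead_functions` are hypothetical illustrations, not LLVM's actual implementation. A real pass must also account for functions whose address is taken and for symbols that are externally visible.

```python
from collections import deque

def reachable_functions(call_graph, roots):
    """Return every function reachable from `roots` by following call edges."""
    seen = set(roots)
    work = deque(roots)
    while work:
        fn = work.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return seen

def drop_dead_functions(call_graph, roots):
    """Keep only the functions that can be reached from the roots."""
    live = reachable_functions(call_graph, roots)
    return {fn: callees for fn, callees in call_graph.items() if fn in live}

# Hypothetical statically linked program: main uses a small part of the library.
linked = {
    "main": ["printf", "strlen"],
    "printf": ["vfprintf"],
    "vfprintf": [],
    "strlen": [],
    "qsort": ["memcpy"],   # never called from main
    "memcpy": [],          # only called from qsort
}
trimmed = drop_dead_functions(linked, ["main"])
```

In this toy program `qsort` and `memcpy` disappear after linking, which is the effect described above for large libraries.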
> 
>>>There are multiple different ways to approach these questions depending on
>>>what we want to do and what the priorities are.  There are several good
>>>solutions, but for now, everything needs to be statically linked.  I
>>>expect this to change over the next month or so.
> 
> 
>>When you have time, I'd like to hear what you're planning in this area
>>as it will directly affect how I build my compiler and VM.
> 
> 
> What do you need, and what would you like?  At this point there are
> several solutions that make sense, but they have to be balanced against
> practical issues.  For example, say we do IPO across the package, and then
> one of the members gets updated.  How do we know to invalidate the results?
> 
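One possible answer to the invalidation question, sketched under assumptions: store a content fingerprint of every package member next to the IPO results, and treat the results as stale when any fingerprint no longer matches. The manifest format and the helper names (`record_inputs`, `stale_members`) are made up for illustration. Nothing in this thread says LLVM does it this way.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Content hash of one package member."""
    return hashlib.sha256(data).hexdigest()

def record_inputs(members):
    """members: name -> file contents. Returns the manifest to store with the IPO results."""
    return {name: fingerprint(data) for name, data in members.items()}

def stale_members(manifest, members):
    """Names that were added, changed, or removed since the manifest was recorded."""
    changed = {n for n, d in members.items() if manifest.get(n) != fingerprint(d)}
    removed = set(manifest) - set(members)
    return changed | removed

# Hypothetical package with two members (the contents are placeholders).
package = {"a.bc": b"module a, version 1", "b.bc": b"module b, version 1"}
manifest = record_inputs(package)
unchanged = stale_members(manifest, package)

package["b.bc"] = b"module b, version 2"   # one member gets updated
outdated = stale_members(manifest, package)
```

A non-empty result means the cross-package optimization has to be redone, or at least rechecked, for the listed members.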
> As I think that I have mentioned before, one long-term way of implementing
> this is to attach analysis results to bytecode files as well as the code.
> Thus, you could compile libc, say, with LLVM to a "shared object" bytecode
> file.  While doing this, the optimizer could notice that "strlen" has no
> side-effects, for example, and attach that information to the bytecode
> file.
> 
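As a rough illustration of attaching analysis results to a library, the sketch below computes a conservative "no side effects" flag per function and serializes it as a summary that could travel with the bytecode file. The input format, the JSON layout, and the function `count_chars` are assumptions made for this example. Recursive functions and calls to unknown functions are treated as having side effects.

```python
import json

def side_effect_free(functions):
    """functions: name -> {"writes": bool, "calls": [names]}.
    Returns the set of functions proven to have no side effects."""
    pure = set()
    changed = True
    while changed:                      # iterate until no new function is proven pure
        changed = False
        for name, info in functions.items():
            if name in pure or info["writes"]:
                continue
            if all(callee in pure for callee in info["calls"]):
                pure.add(name)
                changed = True
    return pure

def write_summary(functions):
    """Serialize the analysis result so it can be stored next to the code."""
    pure = side_effect_free(functions)
    table = {name: {"no_side_effects": name in pure} for name in functions}
    return json.dumps(table, sort_keys=True)

# Hypothetical description of a few library functions.
libc = {
    "strlen": {"writes": False, "calls": []},
    "count_chars": {"writes": False, "calls": ["strlen"]},  # made-up helper
    "strdup": {"writes": True, "calls": ["strlen"]},
    "puts": {"writes": True, "calls": []},
}
summary = json.loads(write_summary(libc))
```

Here `strlen` and `count_chars` are flagged as side-effect free, while `strdup` and `puts` are not.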

While on the subject of annotating bytecode with analysis info, could I
entice someone to also think about carrying other types of source-level
annotations through into bytecode? This is particularly useful for
situations where one wants to use the LLVM infrastructure for its
whole-program optimization capabilities but doesn't want to give up the
ability to debug the final product binary. At the moment, my
understanding is that source code annotations like file names, line
numbers, etc. aren't carried through. When one gets around to linking the
whole program, you end up with a single .s file of native machine code
(which by now is a giant collection of bits picked up from a multitude
of source files) with no ability to do symbolic debugging on the
resulting binary...
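To illustrate what I mean, here is a toy sketch in which every instruction carries its own source location, so the mapping survives linking and a line table can still be produced for the combined program. The `Inst` record and the `link` and `line_table` helpers are hypothetical. This is not a description of any existing LLVM feature.

```python
from typing import NamedTuple

class Inst(NamedTuple):
    text: str   # the instruction itself
    file: str   # source file it came from
    line: int   # source line it came from

def link(modules):
    """Concatenate the instruction streams; each instruction keeps its location."""
    return [inst for module in modules for inst in module]

def line_table(program):
    """Map each position in the linked program back to (file, line)."""
    return {addr: (inst.file, inst.line) for addr, inst in enumerate(program)}

# Two hypothetical modules compiled from different source files.
mod_a = [Inst("call helper", "main.c", 10), Inst("ret", "main.c", 11)]
mod_b = [Inst("add", "helper.c", 3), Inst("ret", "helper.c", 4)]
table = line_table(link([mod_a, mod_b]))
```

With such a table, a debugger could still answer "which file and line does this address belong to?" after whole-program linking.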

> When linking a program that uses libc, the linker wouldn't pull in any
> function bodies from "shared objects", but would read the analysis results
> and attach them to the function prototypes in the program.  This would
> allow the LICM optimizer to hoist strlen calls out of loops when it makes
> sense, for example.
> 
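The hoisting step could look roughly like the sketch below, assuming the linker has already attached a `no_side_effects` flag to each prototype. The instruction tuples, the `summaries` table, and `hoist_invariant_calls` are invented for illustration, and each value is assumed to be defined only once. The check is deliberately conservative: because `strlen` reads memory, nothing is hoisted if the loop contains a store or a call that may have side effects.

```python
def hoist_invariant_calls(loop_body, summaries):
    """loop_body: list of (dest, op, args); op is "store", "call:<name>", or another opcode.
    Returns (preheader, new_body)."""
    def is_pure_call(op):
        return op.startswith("call:") and summaries.get(op[5:], {}).get("no_side_effects", False)

    # Conservative: if anything in the loop may write memory, hoist nothing.
    for _, op, _ in loop_body:
        if op == "store" or (op.startswith("call:") and not is_pure_call(op)):
            return [], list(loop_body)

    defined_in_loop = {dest for dest, _, _ in loop_body if dest is not None}
    preheader, body = [], []
    for inst in loop_body:
        dest, op, args = inst
        if is_pure_call(op) and not any(a in defined_in_loop for a in args):
            preheader.append(inst)           # computed once, before the loop
            defined_in_loop.discard(dest)    # its result is now loop-invariant
        else:
            body.append(inst)
    return preheader, body

# Hypothetical summary read from a libc "shared object" bytecode file.
summaries = {"strlen": {"no_side_effects": True}}
loop = [
    ("n", "call:strlen", ["s"]),     # s is defined outside the loop
    ("c", "icmp_lt", ["i", "n"]),
    ("i2", "add", ["i", "1"]),
]
pre, body = hoist_invariant_calls(loop, summaries)

# If the loop writes through s, the call has to stay where it is.
writing_loop = [("n", "call:strlen", ["s"]), (None, "store", ["s", "i"])]
pre2, body2 = hoist_invariant_calls(writing_loop, summaries)
```

In the first loop the `strlen` call moves to the preheader. In the second it stays put because of the store.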
> Of course there are situations when it is better to actually link the
> function bodies into the program too.  In the strlen example, it might be
> the case that the program will go faster if strlen is inlined into a
> particular call site.
> 
> I'm inclined to start simple and work our way up to these cases, but if
> you have certain usage patterns in mind, I would love to hear them, and we
> can hash out what will really get implemented...
> 
> -Chris
>