[LLVMdev] LLVM Archive Format Extension Proposal

Thu Dec 6 07:34:39 PST 2012

On Dec 5, 2012, at 9:10 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:

On Mon, Dec 3, 2012 at 2:08 PM, Relph, Richard <Richard.Relph at amd.com> wrote:
On Nov 21, 2012, at 4:28 PM, "Relph, Richard" <Richard.Relph at amd.com> wrote:

On Nov 21, 2012, at 12:09 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:

Note that I plan to remove llvm/Bitcode/Archive once Object/Archive is
capable of replacing it. The llvm tools that don't write archive
files have already been switched over to it. Object/Archive already
supports MemoryBuffer as a source for the data.

I had meant to ask in my email about the apparent duplication of Archive in Bitcode and Object libs… Good to know. Since ranlib currently uses Bitcode, that's what I've been focusing on, but I had noticed the Object/Archive.h.

Michael,
   I understand and agree that having 2 Archive implementations is something that should be fixed. Do you have a rough idea about when you might do the unification?
   Also, why unify around the Object/Archive implementation instead of the Bitcode/Archive implementation? What can the Object/Archive implementation "do" that can't be done with the Bitcode implementation?
   I ask because after looking at Archive in Object and Archive in Bitcode, the Archive in Bitcode seems much better documented than the Archive in Object, and feels (at least to me at first glance) like a somewhat better model of what Archives are. And as you've already noted, Object/Archive can't do writes...

Richard

I wrote Object/Archive for a couple reasons. The main reason was
performance. Bitcode/Archive parses the entire archive file up front
including the symbol table. Object/Archive does it lazily and uses
much less memory.

I guess performance measurement depends on use… For what ar and the Linker do, using some memory to structure the archive's symbol table up front is worth the performance gain. As I read the Object/Archive implementation, each search for a symbol does a straight linear scan of the archive symbol table in its serialized form in memory… That's probably not very performant for the archives I have in mind: hundreds of modules with many thousands of symbols.
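
To make the trade-off concrete, here is a minimal, self-contained C++ sketch. It is not the code from either Archive implementation; SymbolEntry, findLinear, SymbolIndex and checkAgreement are hypothetical names, and the symbol table is modeled as a plain vector of (symbol name, member index) pairs rather than the serialized archive member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one archive symbol table entry:
// a symbol name and the index of the member that defines it.
using SymbolEntry = std::pair<std::string, std::size_t>;

constexpr std::size_t NotFound = static_cast<std::size_t>(-1);

// Lazy style: no setup cost, but every lookup walks the table, O(N) per query.
std::size_t findLinear(const std::vector<SymbolEntry> &Table,
                       const std::string &Name) {
  for (const SymbolEntry &E : Table)
    if (E.first == Name)
      return E.second;
  return NotFound;
}

// Up-front style: pay O(N) time and memory once, then O(1) average per query.
class SymbolIndex {
  std::unordered_map<std::string, std::size_t> Map;

public:
  explicit SymbolIndex(const std::vector<SymbolEntry> &Table) {
    for (const SymbolEntry &E : Table)
      Map.emplace(E.first, E.second); // first definition wins, like the scan
  }

  std::size_t find(const std::string &Name) const {
    auto It = Map.find(Name);
    return It == Map.end() ? NotFound : It->second;
  }
};

// Self-check: both strategies must agree on every symbol in the table.
void checkAgreement(const std::vector<SymbolEntry> &Table) {
  SymbolIndex Index(Table);
  for (const SymbolEntry &E : Table)
    assert(findLinear(Table, E.first) == Index.find(E.first));
}
```

With M lookups against N symbols, the scan costs roughly M*N string comparisons, while the index costs one pass over N entries plus M hash lookups. That is the gain I would expect to matter for a link against a large archive.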

There are two static methods in Bitcode/Archive… one does read the entire archive (OpenAndLoad), but the other reads only through the usual early members, including the symbol table (OpenAndLoadSymbols). The Linker class uses this API, while ar and ranlib use OpenAndLoad.

The other reason is that Bitcode/Archive is heavily
focused on bitcode files. It even requires an LLVMContext to
construct. This was not optimal for my object file needs.

I think we could refactor Bitcode/Archive relatively simply to avoid requiring the Context at construction time, and require it only for methods that want a Module returned from Archive. We could change the constructor to accept a pointer to a Context instead of a reference and use a default of 0. We could then add an optional parameter to the few operations that require a Context. And if at the point we actually need a Context, we still don't have one, use getGlobalContext() to acquire one.
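
A rough sketch of the shape I have in mind, written so that it compiles on its own. Context and getGlobalContext() below are stand-ins for LLVMContext and LLVM's getGlobalContext(), and the Archive class with its hasContext() and resolveContext() methods is hypothetical, not the existing Bitcode/Archive interface. The resolution order (explicit argument, then constructor argument, then global context) is my assumption.

```cpp
#include <cassert>
#include <string>

// Stand-ins so the sketch is self-contained; in LLVM these would be
// LLVMContext and getGlobalContext().
struct Context {
  std::string Name;
};

Context &getGlobalContext() {
  static Context Global{"global"};
  return Global;
}

// Hypothetical, trimmed-down Archive: only the Context handling is modeled.
class Archive {
  Context *Ctx; // may stay null until a Module-producing method needs it

public:
  // The Context is optional at construction time (default: none).
  explicit Archive(Context *C = nullptr) : Ctx(C) {}

  // Low-level query: never acquires a Context.
  bool hasContext() const { return Ctx != nullptr; }

  // Called by the few operations that need a Context. An explicit override
  // wins; otherwise use the stored one, falling back to the global context.
  Context &resolveContext(Context *Override = nullptr) {
    if (Override)
      return *Override;
    if (!Ctx)
      Ctx = &getGlobalContext();
    assert(Ctx && "context must be resolved by now");
    return *Ctx;
  }
};
```

Low-level users would construct Archive with no arguments and never trigger the fallback, so they would carry no dependency on a Context at all.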

It feels like what's desired is splitting Bitcode/Archive in two… a low-level, LLVM-neutral piece, and a higher-level, LLVM-dependent piece. Eliminating Context from the constructor goes part way to making Bitcode/Archive usable for low-level uses. Eliminating the "automatic" production of the symbol table requires a bit more thought, but should be doable without too much effort. There are really only a handful of public methods that depend on it, and they have a lot of overlap with those that need a Context. Maybe to support the slow, low-memory search of the symbol table member, we could add an additional method to do that. But I haven't found any users of Object/Archive's findSym() method yet, so maybe we don't need that at all.
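
As a sketch of that split, under the assumption that the symbol table is built on first use rather than at open time: RawArchive and SymbolicArchive are hypothetical names, and symbol extraction is faked by treating each whitespace-separated word in a member's bytes as a symbol, purely so the example can run without LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Low-level, LLVM-neutral layer: member names and raw bytes only.
class RawArchive {
  std::vector<std::pair<std::string, std::string>> Members; // name, bytes

public:
  void addMember(std::string Name, std::string Bytes) {
    Members.emplace_back(std::move(Name), std::move(Bytes));
  }
  std::size_t size() const { return Members.size(); }
  const std::string &memberName(std::size_t I) const {
    assert(I < Members.size() && "member index out of range");
    return Members[I].first;
  }
  const std::string &memberData(std::size_t I) const {
    assert(I < Members.size() && "member index out of range");
    return Members[I].second;
  }
};

// Higher-level layer: owns the symbol table and builds it only on demand.
class SymbolicArchive {
  const RawArchive &Raw;
  std::map<std::string, std::size_t> Symbols;
  bool Built = false;

  // Stand-in for real symbol extraction (assumption: one word, one symbol).
  void buildSymbols() {
    for (std::size_t I = 0; I < Raw.size(); ++I) {
      std::istringstream In(Raw.memberData(I));
      std::string Sym;
      while (In >> Sym)
        Symbols.emplace(Sym, I); // first defining member wins
    }
    Built = true;
  }

public:
  explicit SymbolicArchive(const RawArchive &R) : Raw(R) {}

  bool symbolsBuilt() const { return Built; }

  // Returns the defining member's index, or Raw.size() if no member defines it.
  std::size_t findMember(const std::string &Sym) {
    if (!Built)
      buildSymbols();
    auto It = Symbols.find(Sym);
    return It == Symbols.end() ? Raw.size() : It->second;
  }
};
```

Tools that only list or extract members would stay on the lower layer and never pay for the symbol table; the Linker would use the upper layer and pay for it once, at the first lookup.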

And, of course, we would add the static method to produce an Archive from memory instead of a file, which is pretty straightforward.

What do you think?

Richard