[llvm-commits] [PATCH] Object File Library

Fri Nov 12 18:17:17 PST 2010

Michael,

This ObjectFile interface has a very simple model of object files as a sequence of symbols and sections.  How are you thinking of doing Normalization?  Will it be layered on top of this interface, or an orthogonal interface?  

If on top, then this interface will need to be much richer.  If orthogonal, then perhaps this interface should not be named "ObjectFile", but "SimpleObjectFile" or "RawObjectFile", so that "ObjectFile" can be used as the normalized interface.
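
To make the "on top" option concrete, here is a rough sketch of what layering could look like. Every name in it (RawSymbol, RawSection, RawObjectFile, InMemoryObjectFile, NormalizedObjectFile, globalSymbolNames) is hypothetical and made up for illustration; none of it is taken from the patches.

```cpp
#include <cctype>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not the API in the patches.
struct RawSymbol {
  std::string Name;  // name exactly as stored in the file
  uint64_t Address;  // value exactly as stored in the symbol table
  char TypeChar;     // nm-style type letter, e.g. 'T', 'D', 'b'
};

struct RawSection {
  std::string Name;
  uint64_t Address;
  uint64_t Size;
};

// The simple model: an object file is a sequence of symbols and sections.
class RawObjectFile {
public:
  virtual ~RawObjectFile() = default;
  virtual const std::vector<RawSymbol> &symbols() const = 0;
  virtual const std::vector<RawSection> &sections() const = 0;
};

// Stand-in for a format-specific (COFF, ELF, ...) implementation.
class InMemoryObjectFile : public RawObjectFile {
  std::vector<RawSymbol> Syms;
  std::vector<RawSection> Secs;

public:
  InMemoryObjectFile(std::vector<RawSymbol> S, std::vector<RawSection> C)
      : Syms(std::move(S)), Secs(std::move(C)) {}
  const std::vector<RawSymbol> &symbols() const override { return Syms; }
  const std::vector<RawSection> &sections() const override { return Secs; }
};

// Layered option: the normalized view sees only what the raw interface exposes.
class NormalizedObjectFile {
  const RawObjectFile &Raw;

public:
  explicit NormalizedObjectFile(const RawObjectFile &R) : Raw(R) {}

  // nm prints an uppercase type letter for global (external) symbols.
  std::vector<std::string> globalSymbolNames() const {
    std::vector<std::string> Names;
    for (const RawSymbol &S : Raw.symbols())
      if (std::isupper(static_cast<unsigned char>(S.TypeChar)))
        Names.push_back(S.Name);
    return Names;
  }
};
```

Anything the normalized layer wants to answer has to be derivable from symbols() and sections(), which is why the raw interface would have to grow if normalization sits on top of it.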

Since in the current world the platform nm tool and file format are paired, what model should we follow for llvm-nm?  Should it display its output in the style of the current OS for any object file format?  Or should the output of nm reflect the file being parsed and not the OS you are on?  Here is a simple file, compiled on Darwin and FreeBSD, with the output of each platform's nm tool:

[/tmp]> cat foo.c
int a;  
int b = 5;
static int c;
static int d = 4;
int foo() { return a + b + c + d; }

[freebsd] > cc foo.c -c
[freebsd] > nm foo.o
0000000000000004 C a
0000000000000000 D b
0000000000000000 b c
0000000000000004 d d
0000000000000000 T foo

[darwin]> cc foo.c -c
[darwin]> nm foo.o
0000000000000030 s EH_frame1
0000000000000004 C _a
0000000000000028 D _b
0000000000000078 b _c
000000000000002c d _d
0000000000000000 T _foo
0000000000000048 S _foo.eh

Conceptually these files contain the same simple C code, but we can already see the platforms and file formats have diverged on symbol names (prefix), extra symbols (unwind info), values, etc.
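
As a sketch of what a normalization step might do with the Darwin listing above: parse each nm line, strip the leading underscore, and hide the unwind-info symbols. The function names (parseNmLine, normalizeDarwinName) and the filtering rules are assumptions based only on the two listings here, not a description of any existing tool. Values are left alone, since they differ for layout reasons that name rewriting cannot paper over.

```cpp
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// One "value type name" row, as printed by nm in the listings above.
struct NmEntry {
  uint64_t Value;
  char Type;
  std::string Name;
};

// Parse a single nm output line. Returns false for lines that do not have
// the three-column shape (for example, undefined symbols with no value).
bool parseNmLine(const std::string &Line, NmEntry &Out) {
  std::istringstream In(Line);
  std::string Hex, Type;
  if (!(In >> Hex >> Type >> Out.Name) || Type.size() != 1)
    return false;
  char *End = nullptr;
  Out.Value = std::strtoull(Hex.c_str(), &End, 16);
  if (*End != '\0')
    return false;  // the first column was not a hex number
  Out.Type = Type[0];
  return true;
}

// Assumed normalization for the Darwin listing: returns false for symbols
// that exist only for unwind info, otherwise strips the '_' prefix.
bool normalizeDarwinName(const std::string &Raw, std::string &Out) {
  if (Raw.compare(0, 8, "EH_frame") == 0)
    return false;
  if (Raw.size() > 3 && Raw.compare(Raw.size() - 3, 3, ".eh") == 0)
    return false;
  Out = (!Raw.empty() && Raw[0] == '_') ? Raw.substr(1) : Raw;
  return true;
}
```

With these rules the Darwin names reduce to a, b, c, d and foo, matching the FreeBSD listing, while the values still differ.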

-Nick

> From: Michael Spencer <bigcheesegs at gmail.com>
> Date: November 11, 2010 6:23:34 PM PST
> To: llvm-commits <llvm-commits at cs.uiuc.edu>
> Subject: [llvm-commits] [PATCH] Object File Library
> 
> Attached are updated patches to be reviewed. I've split them up into
> the generic API, the COFF and ELF implementations, tool changes, and
> tests.
> 
> This does not currently implement the Serialization/Normalization
> split that I reference in my talk. I'm going to wait to do that split
> until the line is firmly defined, as I still haven't figured out how
> exactly to represent relocation data.
> 
> A current major blocker that you can see throughout the ELF
> implementation (and ignored in COFF (which I knew much better and
> didn't need debugging help)) is how invalid object file errors are
> handled. I currently just call report_fatal_error, which is really not
> how they should be handled. I've been complaining about the same
> problem in the System library on IRC recently too while trying to
> clean up the Windows impl.
> 
> What I would like to do is use an error object based on the concept of
> std::error_code to report both system (IO, memory, syscall), and
> "parsing" errors from invalid object files. This should also include
> at least the address in the file at which the error occurred. I don't
> feel that we need detailed (clang style :P) diagnostics because tools
> generate object files, not people. At the same time I don't want to
> just dump "object file invalid at 0xdeadbeef" to the user. Using
> 'std::error_category' would allow mixing the two while providing more
> detailed info such as "invalid symbol name string index at
> 0xblahblah". Also, each object file could use a custom error_category
> for special errors. It also allows clients that simply don't care to
> just check for success and not pay for message formatting.
> 
> - Michael Spencer
<object-S0P0-add-LLVMObject.patch><object-S0P1-add-COFF-support.patch><object-S0P2-add-ELF-support.patch><object-S0P3-llvm-nm.patch><object-S0P4-llvm-objdump.patch><object-S0P5-add-tests.patch>
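
A minimal sketch of the error scheme described in the quoted message, written against the standard C++11 <system_error> header rather than any LLVM type. The enum values, the ObjectError struct, and the helpers checkSymbolNameIndex and describe are hypothetical names invented for this example.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical "parsing" error codes for invalid object files.
enum class object_error {
  success = 0,
  invalid_file_type,
  invalid_symbol_name_index
};

// Custom category supplying the detailed message text.
class object_category_impl : public std::error_category {
public:
  const char *name() const noexcept override { return "object"; }
  std::string message(int EV) const override {
    switch (static_cast<object_error>(EV)) {
    case object_error::success:
      return "success";
    case object_error::invalid_file_type:
      return "invalid file type";
    case object_error::invalid_symbol_name_index:
      return "invalid symbol name string index";
    }
    return "unknown object file error";
  }
};

const std::error_category &object_category() {
  static object_category_impl Instance;
  return Instance;
}

std::error_code make_error_code(object_error E) {
  return std::error_code(static_cast<int>(E), object_category());
}

// Pairs the code with the file offset at which the problem was found.
struct ObjectError {
  std::error_code Code;
  uint64_t Offset;
  explicit operator bool() const { return static_cast<bool>(Code); }
};

// Example check: a symbol's name index must fall inside the string table.
ObjectError checkSymbolNameIndex(uint64_t Index, uint64_t StringTableSize,
                                 uint64_t EntryOffset) {
  if (Index >= StringTableSize)
    return {make_error_code(object_error::invalid_symbol_name_index),
            EntryOffset};
  return {std::error_code(), 0};
}

// The message is formatted only when a client asks for it.
std::string describe(const ObjectError &E) {
  std::ostringstream OS;
  OS << E.Code.message() << " at 0x" << std::hex << E.Offset;
  return OS.str();
}
```

A client that does not care about details tests the ObjectError as a bool and never calls describe(), so it pays nothing for message formatting. System errors fit the same struct by storing a code from std::generic_category() or std::system_category().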