[llvm-commits] PATCH: Fix ELFObjectFile::getSymbolAddress which makes llvm-nm work incorrectly on executables

Michael Spencer bigcheesegs at gmail.com
Fri Jun 22 12:49:07 PDT 2012


On Fri, Jun 22, 2012 at 3:11 AM, Alexey Samsonov <samsonov at google.com> wrote:
> Hi!
>
> libObject seems to incorrectly implement
> ELFObjectFile::getSymbolAddress. See this reproducer:
> $ cat main.cc
> int main() {
>   return 0;
> }
> $ g++ main.cc -o main.out
> $ nm main.out | grep main
>                  U __libc_start_main@@GLIBC_2.2.5
> 00000000004004b4 T main
> $ llvm-nm main.out | grep main
>          U __libc_start_main@@GLIBC_2.2.5
> 00800884 T main
>
> Let's try to work out what's wrong:
> 0x800884 - 0x4004b4 = 0x4003d0
> $ objdump -h main.out | grep .text
>  11 .text         000001c8  00000000004003d0  00000000004003d0  000003d0  2**4
>
> So, the symbol address is incorrectly incremented by the address of the
> containing section (.text, at 0x4003d0). To my understanding, the attached
> patch should fix this. Please check whether it is OK to apply.
> getSymbolFileOffset in the same file seems to be fine, at least according to
> this quote from the ELF spec:
>
> Symbol table entries for different object file types have slightly different
> interpretations for the st_value member.
> <...>
> * In relocatable files, st_value holds a section offset for a defined
> symbol. That is, st_value is an offset from the beginning of the section
> that st_shndx identifies.
> * In executable and shared object files, st_value holds a virtual address.
> [...]
>
> --
> Alexey Samsonov, MSK
>

I agree that llvm-nm is incorrect here, but I'm not sure this is the
correct fix. The issue is that exactly what getSymbolAddress is
supposed to return is undocumented. There was quite a bit of
discussion about it in "[llvm-commits] MachOObjectFile fix functions",
but even after reading it I'm not 100% sure what it should do.  This
patch also doesn't seem to handle the difference between a relocatable
file and an executable.
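
To make that distinction concrete, here is a minimal sketch of what the
quoted ELF text implies. This is only an illustration: symbolAddress and
its parameters are hypothetical names, not the libObject API, and it is not
necessarily the meaning we will settle on for getSymbolAddress.

```cpp
#include <cstdint>

// ELF e_type values as defined by the ELF specification.
enum ElfType : uint16_t { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };

// Hypothetical helper (not part of LLVM): compute the address of a defined
// symbol from its st_value and the sh_addr of the section named by st_shndx.
uint64_t symbolAddress(ElfType type, uint64_t st_value, uint64_t sectionAddr) {
  // Relocatable files: st_value is an offset from the start of the section,
  // so the section's address has to be added.
  if (type == ET_REL)
    return sectionAddr + st_value;
  // Executables and shared objects: st_value is already a virtual address,
  // so adding the section address would count it twice.
  return st_value;
}
```

With the numbers from the reproducer, symbolAddress(ET_EXEC, 0x4004b4,
0x4003d0) gives 0x4004b4, matching nm, while adding the section address
unconditionally gives the 0x800884 that llvm-nm prints.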

I've CCed the people from the above thread. I would like to decide on
a well-defined meaning for all of the Address/Offset functions and
document that in the code before we change anything, as I believe the
ELF MCJIT relies on the current behavior.

- Michael Spencer



