[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities

Fri Jan 18 14:17:59 PST 2013

On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM ecosystem has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
> 2. Relying on emitted assembly directives (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like the actual DWARF encoding.
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The last of these is also why, IMHO, our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.
>
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps, which would make
> tests easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
>
> pyelftools is relatively widely used, so it's well tested; all it
> requires is Python 2.6 or higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli
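
To make the contrast between raw binary dumps and decoded textual dumps
concrete, here is a minimal sketch that decodes a few fields of a
little-endian ELF64 file header using only Python's struct module. It
is not pyelftools' API and not the output format of elf-dump or readelf;
the function names (make_sample_header, decode_elf64_header), the output
labels, and the synthetic header bytes are all assumptions made for
illustration.

```python
import struct

# Little-endian ELF64 file header: the 16-byte e_ident, then e_type,
# e_machine (u16), e_version (u32), e_entry, e_phoff, e_shoff (u64),
# e_flags (u32), and six u16 fields ending with e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

ELF_TYPES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}
ELF_MACHINES = {3: "EM_386", 40: "EM_ARM", 62: "EM_X86_64"}


def make_sample_header():
    """Build a synthetic header (illustrative values, not real compiler output)."""
    # Magic, EI_CLASS=2 (64-bit), EI_DATA=1 (little-endian), EI_VERSION=1,
    # EI_OSABI=0, then 8 bytes of padding.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    return ELF64_HEADER.pack(ident, 1, 62, 1, 0, 0, 0x40, 0, 64, 0, 0, 64, 3, 2)


def decode_elf64_header(data):
    """Return a few header fields as 'Name: value' lines that a test can match."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("truncated ELF header")
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch handles only little-endian ELF64")
    return [
        "Type: %s" % ELF_TYPES.get(e_type, hex(e_type)),
        "Machine: %s" % ELF_MACHINES.get(e_machine, hex(e_machine)),
        "SectionHeaderOffset: 0x%x" % e_shoff,
        "SectionHeaderCount: %d" % e_shnum,
        "StringTableSectionIndex: %d" % e_shstrndx,
    ]


if __name__ == "__main__":
    for line in decode_elf64_header(make_sample_header()):
        print(line)
```

A test written against output like this can match a line such as
"Machine: EM_X86_64" instead of a run of hex bytes, which is the
maintainability gain the proposal is after.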

I'm fine with this as long as llvm-dwarfdump gets maintained.

The only problem is that LLVM does not require Python 2.6; I think the
min version is still 2.4. Although I would love to move to 2.6 :P

- Michael Spencer
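
One way the version mismatch raised above could be handled is to gate
the pyelftools-based tests on the interpreter version, so they are
skipped rather than failing on older Pythons. The sketch below is only
an illustration of that idea; the helper name python_is_new_enough is
hypothetical, and this is not how LLVM's test infrastructure actually
decides which tests to run.

```python
import sys

# Minimum Python version the proposal states for pyelftools.
MIN_PYTHON = (2, 6)


def python_is_new_enough(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum.

    version_info defaults to the running interpreter's sys.version_info;
    passing a tuple explicitly makes the check easy to exercise.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_is_new_enough():
        print("pyelftools-based tests can run")
    else:
        print("skipping pyelftools-based tests: Python too old")
```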