[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities

Fri Jan 18 13:29:58 PST 2013

+ other debug info people (Eric & Paul)

On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM ecosystem has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.

Ideally I'd like to see support added whenever a code change is made
to a feature. So long as we hold ourselves to a "test new changes"
policy, that can gate/encourage the necessary feature support in
llvm-dwarfdump.

Since no one's likely to go back & write a bunch of regression tests
for all the existing code it seems premature to add new features to
llvm-dwarfdump before there's a use-case. It does sometimes mean bug
fixes appear to be costly because they include adding the missing test
infrastructure support, but that's essentially where the cost is
anyway.

> 2. Relying on assembly directive emission (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)
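
One possible reading of "actual DWARF encoding", offered only as an
illustration and not as what Eli necessarily meant: a directive such as
.cfi_def_cfa_offset 16 is text, while the object file holds the opcode
DW_CFA_def_cfa_offset (0x0e) followed by a ULEB128 operand. A test that
only matches the directive never looks at those bytes. The helper names
below are hypothetical.

```python
# Opcode value for DW_CFA_def_cfa_offset, as defined by the DWARF standard.
DW_CFA_DEF_CFA_OFFSET = 0x0E


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F      # take the low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def encode_def_cfa_offset(offset):
    """Bytes a CFI program would contain for '.cfi_def_cfa_offset <offset>'."""
    return bytes([DW_CFA_DEF_CFA_OFFSET]) + encode_uleb128(offset)
```

A directive-level test sees only the assembly text; a test on decoded
section contents would also catch a wrong opcode or a mis-encoded operand.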

> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
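
To illustrate the difference between a raw binary dump and a decoded
one, here is a minimal sketch that turns the first bytes of an ELF
header into named fields a test could match. It uses only the Python
standard library, the function name is hypothetical, and it is not how
elf-dump or pyelftools is implemented.

```python
import struct

ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}


def describe_elf_header(data):
    """Return a few decoded ELF header fields as text lines."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_CLASS and EI_DATA live at offsets 4 and 5 of e_ident.
    ei_class, ei_data = struct.unpack_from("BB", data, 4)
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return [
        "Class: %s" % ELF_CLASS.get(ei_class, "unknown"),
        "Data: %s" % ELF_DATA.get(ei_data, "unknown"),
        "Type: %d" % e_type,
        "Machine: %d" % e_machine,
    ]


# Hand-built sample: ELF64, little endian, ET_REL (1), EM_X86_64 (62).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 1, 62)
```

A test matching "Machine: 62" stays readable and survives unrelated
layout changes, whereas a test matching a hex dump of the same bytes
does not.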
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.

Are there particular recent commits you've been concerned about the
test quality of? I've been trying to keep an eye on this but, again,
don't necessarily fully understand the ramifications of some changes.

> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps, which would make
> tests easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.

I would be a little hesitant about test execution performance if it
involved invoking a new Python process for each debug info test. But
numbers could convince me. Beyond that I can't rationally claim any
particular need to support llvm-dwarfdump as the tool of choice over
any 3rd party tool.
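
As a rough way to get such numbers, the sketch below times how long it
takes to start a fresh Python interpreter that does nothing. It
measures only process startup, not importing a DWARF library or
parsing a file, so it gives a lower bound on the per-test cost. The
function name is hypothetical.

```python
import subprocess
import sys
import time


def mean_startup_seconds(runs=5):
    """Average wall-clock time to launch an interpreter that exits at once."""
    start = time.perf_counter()
    for _ in range(runs):
        # Use the same interpreter that is running this script.
        subprocess.check_call([sys.executable, "-c", "pass"])
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("mean interpreter startup: %.1f ms" % (mean_startup_seconds() * 1000))
```

Multiplying the result by the number of debug info tests gives an
estimate of the minimum overhead added to a full test run.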

> pyelftools is relatively widely used, so it's well tested. All it
> requires is Python 2.6 or higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli