[llvm-dev] EuroLLVM 2019 - LLVM Binutils BoF notes

Tue Apr 30 03:35:57 PDT 2019

Hi All,

Petr Hosek kindly took notes for the LLVM Binutils BoF at the recent
EuroLLVM, and I am putting them up here for all to see (see below). I'll
separately email around my own notes from the round table that happened the
following day.

Thanks for all the contributions!

James

---

The LLVM binary utilities were originally used for testing LLVM components,
but are now being used more widely as a replacement for GNU binutils.

Lots of bugs are being filed and resolved (70 open, 81 resolved in the last
2 years). GSoC 2018 project by Paul Semel. 920 commits to the LLVM binary
utilities (excluding related libraries).

LLVM tools should be "drop-in replacements" for GNU binutils tools. These
tools are often used in configure-style scripts, so it's important to have
identical output and switches (even if that means, e.g., using a special
name for the tool or a flag).

This was an important design point for llvm-objcopy: if GNU objcopy's
behavior makes sense, we copy that behavior; otherwise we try to support the
specific use case. Some use cases may not be supported, e.g. because they're
weird. Using a different name for compatibility reasons might be a
reasonable solution, e.g. --strip-all-gnu in llvm-objcopy. GNU objcopy works
differently: it basically relinks the file. llvm-objcopy is architected
differently, but we could always find a way forward.

lld already uses a similar mode of operation, behaving differently based on
the name it is invoked under (e.g. ld.lld vs lld-link). llvm-readelf is
equivalent to llvm-readobj --elf-output-style=GNU and changes some other
flags. A similar patch for llvm-symbolizer and addr2line is currently under
review.
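
As a rough illustration of the name-based mode of operation described
above, here is a minimal Python sketch (not the actual LLVM code). Only the
llvm-readelf / llvm-readobj pairing and the --elf-output-style flag come
from these notes; the PERSONALITIES table, its key names, and the helper
functions are hypothetical.

```python
import os

# Hypothetical table: invocation name -> defaults for that personality.
# The dictionary layout and key names are assumptions for illustration.
PERSONALITIES = {
    "llvm-readobj": {"output_style": "LLVM"},
    "llvm-readelf": {"output_style": "GNU"},
}


def select_personality(argv0):
    """Pick defaults from the name the tool was invoked under."""
    name = os.path.basename(argv0)
    # Tolerate a Windows-style ".exe" suffix.
    if name.lower().endswith(".exe"):
        name = name[:-4]
    # Unknown names fall back to the llvm-readobj defaults.
    return dict(PERSONALITIES.get(name, PERSONALITIES["llvm-readobj"]))


def effective_options(argv):
    """Combine name-derived defaults with an explicit flag, which wins."""
    options = select_personality(argv[0])
    for arg in argv[1:]:
        if arg.startswith("--elf-output-style="):
            options["output_style"] = arg.split("=", 1)[1]
    return options
```

With this shape, a symlink named llvm-readelf gets GNU-style defaults,
while an explicit flag on the command line still overrides the name.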

The original purpose of the LLVM tools was testing, not human consumption,
but based on the number of requests and bugs, it seems we should provide
byte-for-byte identical output. configure scripts often break in subtle
ways when the output isn't identical: they rely on the human-readable
output even though it's consumed by a machine. It might be valuable to
support both machine-readable (e.g. JSON) and human-readable output (for
legacy vs future scripts).
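
To make the idea of supporting both output kinds concrete, here is a small
Python sketch that renders the same records either as fixed-width text or
as JSON. The record fields, the sample values, and the function names are
all made up for illustration; they do not mirror the schema or options of
any real tool.

```python
import json

# Hypothetical symbol records; field names and values are illustrative only.
SYMBOLS = [
    {"name": "main", "address": 0x401000, "size": 42},
    {"name": "helper", "address": 0x40102A, "size": 16},
]


def render_human(symbols):
    """Fixed-width text: easy to read, fragile for scripts to parse."""
    lines = ["%-16s %-8s %s" % ("Address", "Size", "Name")]
    for sym in symbols:
        lines.append("%016x %-8d %s" % (sym["address"], sym["size"], sym["name"]))
    return "\n".join(lines)


def render_json(symbols):
    """Structured output that scripts can parse without relying on columns."""
    return json.dumps(symbols, indent=2, sort_keys=True)


def render(symbols, fmt="human"):
    """Select a renderer; both consume the same underlying records."""
    if fmt == "json":
        return render_json(symbols)
    if fmt == "human":
        return render_human(symbols)
    raise ValueError("unknown output format: %r" % fmt)
```

Keeping one set of records behind both renderers means the human-readable
layout can stay frozen for legacy scripts while newer scripts consume the
JSON form.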

There are three different types of object files (code generated by GCC or
Clang, GCC LTO and Clang LTO), and neither LLVM nor GNU tools handle these
correctly. GNU binutils tools use a plugin to handle LTO, so someone could
write a plugin to handle LLVM's LTO code in binutils and, vice versa, the
LLVM binutils could support the plugin interface.

What about backward compatibility guarantees? In general, breaking backward
compatibility between releases is not that bad as long as there's a path
forward. The best time to break compatibility is now, because not very many
projects use the LLVM versions of these tools yet; that won't be the case
in the future.

Most tools still use LLVM's cl::opt and haven't moved to tablegen. It
depends on the tool, but in some cases matching binutils would require a
complete re-architecture. Right now we have the best possible opportunity to
change the architecture if we want to, because we're still in the ramp-up
stage.

--help shows a lot of options generated from default opt, which makes the
output unusable. Could we remove all the useless options to make the help
output more useful? If you're aware of specific instances of this, please
file bugs; people could pick these up as starter bugs since they are
usually very easy to fix.

In the case of llvm-objdump -D we're actually disassembling, so it's really
hard to match the output byte-for-byte. But we can also do much better than
binutils' objdump output, so trying to produce identical output may not be
worth it. What would the improvements be? Support for Thumb and Arm in the
same output, constant islands, control-flow visualization, etc.

Is it possible to implement all these tools as a library with a driver
that's as thin as possible? We don't have a good solution right now for most
of these tools, but we should really aspire to this as a noble goal. Any
patches that get us closer to that goal are very welcome. Writing things as
a library first is a great goal.

Each tool currently links in a big portion of the LLVM backend, inflating
its size. Using dynamic linking introduces a significant performance hit. We
could busyboxify the LLVM tools (ship them as a single multi-call binary) to
reduce size without taking the performance hit.
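
For readers unfamiliar with the busybox approach, the following Python
sketch shows the general shape of a multi-call executable: every tool is a
library-style entry point, and one thin driver picks the entry point from
the invocation name. It is only an illustration under stated assumptions.
The entry-point functions, the TOOLS table, and the "llvm-multicall" name
are hypothetical and were not proposed at the BoF.

```python
import os


# Hypothetical tool entry points. In a real multi-call binary each would be
# the "main" of one tool, exposed as a library function.
def objcopy_main(args):
    return "objcopy called with %s" % (args,)


def strip_main(args):
    return "strip called with %s" % (args,)


# One table shared by a single executable; the names are illustrative.
TOOLS = {
    "llvm-objcopy": objcopy_main,
    "llvm-strip": strip_main,
}


def dispatch(argv):
    """Run the tool named by argv[0]; if argv[0] is not a known tool name
    (e.g. the made-up `llvm-multicall`), treat argv[1] as the tool name."""
    name = os.path.basename(argv[0])
    rest = argv[1:]
    if name not in TOOLS and rest:
        name, rest = rest[0], rest[1:]
    if name not in TOOLS:
        raise SystemExit("unknown tool: %s" % name)
    return TOOLS[name](rest)
```

Because all tools live in one statically linked image, the shared backend
code is stored once, which is where the size saving would come from.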