[llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

Tue Nov 6 04:52:50 PST 2018

Hi all,

A broad goal of many of the LLVM binary tools, such as llvm-objcopy and
llvm-objdump, is to provide an alternative to their GNU equivalents, and as
such, these tools have been developed to be command-line compatible. One
tool where this hasn’t been the case up to now is llvm-readobj (aka
llvm-readelf).

There was some discussion in https://reviews.llvm.org/D33872 about the
purpose of llvm-readobj, so I’d like to ask the community's opinion. What
is the purpose of llvm-readobj? Is it purely intended as an aid to testing?
Should it be aiming to be GNU compatible, like most of the rest of the LLVM
tools?

The main issue I discovered with GNU compatibility is that a few
command-line flags have different meanings in GNU readelf and
llvm-readobj:

* -s means dump symbols in GNU readelf, but dump sections in llvm-readobj
* -t means dump section details in GNU readelf, but dump symbols in
llvm-readobj
* -a means dump all in GNU readelf, but dump ARM attributes in llvm-readobj

There are also several missing aliases and some missing features, but we
can implement those with no negative impact on the users of llvm-readobj,
so I won't discuss those here.

Also of relevance here is how several letters following a single dash are
handled. My understanding of GNU’s behaviour is that each letter is treated
as a separate short option, whereas llvm-readobj treats the whole string as
one long option (e.g. ‘readelf -abc’ is equivalent to ‘readelf -a -b -c’,
but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at
least partly related to the cl::opt/libOption issues discussed in
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.
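
To make the difference concrete, here is a rough Python sketch of the two
interpretations. It is only an illustration, not how either tool is
implemented: the function names are made up, and it ignores short options
that take an argument.

```python
def is_single_dash_word(arg):
    """True for arguments like '-abc' (one dash, more than one letter)."""
    return arg.startswith("-") and not arg.startswith("--") and len(arg) > 2


def expand_gnu_style(args):
    """GNU-style reading: every letter after a single dash is its own option."""
    out = []
    for arg in args:
        if is_single_dash_word(arg):
            out.extend("-" + ch for ch in arg[1:])
        else:
            out.append(arg)
    return out


def expand_llvm_style(args):
    """llvm-readobj reading as described above: '-abc' is the same as '--abc'."""
    out = []
    for arg in args:
        if is_single_dash_word(arg):
            out.append("-" + arg)
        else:
            out.append(arg)
    return out


print(expand_gnu_style(["-abc"]))   # ['-a', '-b', '-c']
print(expand_llvm_style(["-abc"]))  # ['--abc']
```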

I'd like to propose that we fix the three switches above such that they
match GNU readelf's interpretation, and to change short-option handling
similarly. This would inevitably result in some test churn (there are
approximately 200 tests between core llvm and lld that would need
updating), but it is manageable. A bigger issue is that anyone already
using llvm-readobj would suddenly find the switches changing on them. On
the other hand, I think the benefit for those used to GNU readelf
outweighs the cost.

We could do a few different things to mitigate the impact of changing over,
roughly in my order of preference, if we decide against just taking the
plunge and changing the meaning:

1) For the next release, add a deprecation warning, saying that the
switches’ meanings will be changed in a following release, and then fix it
after the next release has been created, along with release notes
documenting the change.
2) Provide a “--gnu-mode” or similar switch that changes the meaning of the
command-line switches above to match the GNU mode. This again provides an
opt-in, but also allows downstream ports to enable it by default, should
they wish.
3) Change the meaning of the switches only for llvm-readelf, and not for
llvm-readobj. This is similar to the behaviour of --elf-output-style: it is
GNU for llvm-readelf, and LLVM for llvm-readobj, but does have essentially
the same potential for disrupting users as 1).
4) Provide a third user-facing driver (e.g. “llvm-gnu-readelf”) that
provides a different CLI to the others. This makes it an opt-in feature, by
using a different executable.
5) Just accept this divergence, although I personally would prefer not to,
as this has the potential to confuse users migrating from GNU tools to LLVM
tools.
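
As a rough illustration of how options 2) and 3) differ, here is a small
Python sketch that picks the meaning of the three conflicting switches
either from an opt-in switch or from the name the tool was invoked as. The
function names and the meaning strings are hypothetical, and "--gnu-mode"
is only the switch name suggested above, not an existing option.

```python
import os

# Meanings of the three conflicting switches, as listed earlier in this mail.
# The value strings are illustrative labels, not real option names.
LLVM_MEANINGS = {"-s": "sections", "-t": "symbols", "-a": "arm-attributes"}
GNU_MEANINGS = {"-s": "symbols", "-t": "section-details", "-a": "all"}


def use_gnu_meanings(argv0, args, by_tool_name=False):
    """Decide which table applies.

    Option 2: an explicit opt-in switch (hypothetical '--gnu-mode').
    Option 3: the executable name, enabled here with by_tool_name=True.
    """
    if "--gnu-mode" in args:
        return True
    if by_tool_name:
        return os.path.basename(argv0).startswith("llvm-readelf")
    return False


def interpret(argv0, args, by_tool_name=False):
    """Return the meaning of each conflicting switch found in args."""
    gnu = use_gnu_meanings(argv0, args, by_tool_name)
    table = GNU_MEANINGS if gnu else LLVM_MEANINGS
    return [table[a] for a in args if a in table]


print(interpret("llvm-readobj", ["-s"]))                    # ['sections']
print(interpret("llvm-readobj", ["--gnu-mode", "-s"]))      # ['symbols']
print(interpret("llvm-readelf", ["-t"], by_tool_name=True)) # ['section-details']
```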

Thoughts?

James