[PATCH] Add -bare option to llvm-objdump

Wed Sep 10 20:29:39 PDT 2014

On Wed, Sep 10, 2014 at 3:03 PM, Steve King <steve at metrokings.com> wrote:

> On Wed, Sep 10, 2014 at 2:14 PM, Sean Silva <chisophugis at gmail.com> wrote:
> >
> > On Wed, Sep 10, 2014 at 1:38 PM, Steve King <kingshizzle at gmail.com> wrote:
> >>
> >> The -bare option keeps objdump output consistent with test expectations
> >> regardless of host platform.  As discussed on the list, default
> >> llvm-objdump output could change per host:  GNU style for Linux, otool
> >> for Apple, etc.
> >
> >
> > I think this needs some discussion in LLVMDev. I don't see the point in
> > slaving to reproduce the system tool's output when we don't match the
> > command-line options.
> >
>
> Command line mismatch is a good point.  IMHO, matching the commonly
> used command line knobs and typical output style would be good enough.
>

Right, you mean like otool's -d ("Display the contents of the
(__DATA,__data) section") or -t ("Display the contents of the
(__TEXT,__text) section.  With the -v flag, this disassembles the text.")?
;)

llvm-objdump just happens to sort of resemble GNU objdump to some extent.
What is your goal? From what I can gather, it seems like you just want to
make llvm-objdump's default behavior match GNU objdump, which doesn't seem
like a very good direction (how deep is the rabbit hole? why GNU objdump
and not otool or the Windows one?). I do see a lot of value in making
llvm-objdump's default output be as useful as possible; I don't see much
value in blindly making the default behavior as close to GNU objdump as
possible. In fact, if we want a really high-quality output, taking GNU
objdump as a model seems like a poor idea anyway; off the top of my head I
can think of multiple ways it could be improved:

- put the symbol name at the beginning of the line (note that llvm-objdump
already gets this right!)
- have options to disassemble sub-parts of the object file (like section
ranges, or single symbols), instead of just the entire file, to avoid
wasting a bunch of time printing stuff (there have been days where my
workflow was bottlenecked on a minutes-long objdump call, when I was trying
to just get a single function...)
- do not ever wrap "raw-insn" output.
- print section offsets in a consistent format (e.g. always the same number
of digits, padding with 0's) so that they are easily searchable
- don't suck for interoperability with awk and other tools; consider this,
which works fine with otool's -tv format: awk '$1 ~ /foo/ {p=1} $1 ~ /:$/
&& $1 !~ /foo/ {p=0} p'; a similar thing works for printing ranges. An
easily-digested format is very useful for doing various sorts of analyses
(instruction types, instruction length, average per-function instruction
length, stack traffic analysis, instruction size distribution, etc.). There
are many analyses that I haven't done due to GNU objdump's output format
sucking. In general, otool's output format is extremely good in this
regard, and if anything, IMO we should use *it* as a model (unfortunately
otool works in such a limited set of scenarios (basically just Mac
binaries); if we could have that format with llvm-objdump's
disassemble-every-object-format, every-architecture capabilities, that
would be great).
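
For readers who don't speak awk, here is a minimal Python sketch of what the
one-liner above does. It assumes an otool -tv-like layout in which every
function starts with a label line whose first field ends in ':' and the
instruction lines follow it. The sample lines and the name extract_symbol are
made up for illustration, and the sketch uses a plain substring test where the
awk version uses a regex.

```python
def extract_symbol(lines, pattern):
    """Yield the label line(s) matching `pattern` plus the lines under them.

    Mirrors: awk '$1 ~ /foo/ {p=1} $1 ~ /:$/ && $1 !~ /foo/ {p=0} p'
    (substring match instead of a regex; an assumption of this sketch).
    """
    printing = False
    for line in lines:
        fields = line.split()
        first = fields[0] if fields else ""
        if pattern in first:
            # First field mentions the symbol: start (or keep) printing.
            printing = True
        if first.endswith(":") and pattern not in first:
            # A different label begins: stop printing.
            printing = False
        if printing:
            yield line


# Hypothetical disassembly text, shaped like label-then-instructions output.
SAMPLE = [
    "_foo:",
    "0000000000000000 pushq %rbp",
    "0000000000000001 retq",
    "_bar:",
    "0000000000000010 pushq %rbp",
    "0000000000000011 retq",
]

if __name__ == "__main__":
    for line in extract_symbol(SAMPLE, "foo"):
        print(line)
```

The filter only works because the label sits alone at the start of its own
line, which is the property being argued for here.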

> I've helped colleagues with llvm-objdump vs. GNU objdump on one past
> project already and the bar isn't that high.  The -bare option at
> least opens the door to per-host defaults.
>

> On a practical note, analyzing llvm-objdump test failures is labor
> intensive and a bit nerve-racking since I'm not familiar with many of
> the target architectures.

If you have questions, please ask on IRC or the mailing lists. A solution
that doesn't make sense from the community's point of view (like a quick
hack -bare to avoid needing to look at how you are affecting tests) will
get ripped out: just recently you saw me do so for a number of objdump
options.

I recommend that you first look into changing these tests to use
-filetype=asm (where possible) so that they don't even need to use objdump
in the first place.

>   The -bare option allows us to change output
> in purely aesthetic ways without breaking tests.

Is it worth burdening every developer in the LLVM project to remember to
use -bare in order to get consistent output for tests? We already have
enough word-of-mouth, poorly documented testing "best practices" (like <%s
for file input to opt). I personally don't think that it's worth adding a
new one just "to change output in purely aesthetic ways".

-- Sean Silva