[PATCH] Add -bare option to llvm-objdump

Thu Sep 11 00:11:14 PDT 2014

On Wed, Sep 10, 2014 at 8:29 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> Right, you mean like otool's -d ("Display the contents of the
> (__DATA,__data) section") or -t ("Display the contents of the
> (__TEXT,__text) section.  With the -v flag, this disassembles the text.")?
> ;)
>

I've never used otool, but presumably Kevin has some idea here.

> What is your goal?

A symbolizing disassembler with at least the possibility to tune
output to match existing norms per host platform.

> From what I can gather, it seems like you just want to
> make llvm-objdump's default behavior match GNU objdump, which doesn't seem
> like a very good direction (how deep is the rabbit hole? why GNU objdump and
> not otool or the Windows one?).

What I personally need is just a symbolizing disassembler.  What my
colleagues will expect is a fair approximation of GNU objdump because
that's what they already know.  Kevin piped up early that he wanted
otool-style output, so we ended up at the goal above.

> I do see a lot of value in making
> llvm-objdump's default output be as useful as possible; I don't see much
> value in blindly making the default behavior as close to GNU objdump as
> possible.

Right, I'm not suggesting this either.

> In fact, if we want a really high-quality output, taking GNU
> objdump as a model seems like a poor idea anyway; off the top of my head I
> can think of multiple ways it could be improved:
>
> - put the symbol name at the beginning of the line (note that llvm-objdump
> already gets this right!)
> - have options to disassemble sub-parts of the object file (like section
> ranges, or single symbols), instead of just the entire file, to avoid
> wasting a bunch of time printing stuff (there have been days where my
> workflow was bottlenecked on a minutes-long objdump call, when I was trying
> to just get a single function...)
> - do not ever wrap "raw-insn" output.
> - print section offsets in a consistent format (e.g. always the same number
> of digits, padding with 0's) so that they are easily searchable
> - don't suck for interoperability with awk and other tools; consider this
> which works fine with otool's -tv format: awk '$1 ~ /foo/ {p=1} $1 ~ /:$/ &&
> $1 !~ /foo/ {p=0} p';

I wish I had the chops to consider this awk line :-)
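
For anyone else squinting at it, here is my rough reading of that awk
filter, written out as a Python sketch.  I'm assuming symbol labels sit
alone on a line and end in ':', and the function name and sample listing
are made up for illustration, not real otool output.

```python
import re

def select_symbol(lines, pattern):
    """Yield lines from a label matching `pattern` up to the next non-matching label.

    Mirrors: awk '$1 ~ /foo/ {p=1} $1 ~ /:$/ && $1 !~ /foo/ {p=0} p'
    """
    printing = False
    for line in lines:
        fields = line.split()
        first = fields[0] if fields else ""   # awk's $1 (empty on blank lines)
        if re.search(pattern, first):
            printing = True                   # $1 ~ /foo/ {p=1}
        if first.endswith(":") and not re.search(pattern, first):
            printing = False                  # $1 ~ /:$/ && $1 !~ /foo/ {p=0}
        if printing:
            yield line                        # bare `p` prints when set

# Hypothetical listing, only shaped like label-per-line disassembly.
sample = [
    "_foo:",
    "0000000000000000\tpushq\t%rbp",
    "0000000000000001\tretq",
    "_bar:",
    "0000000000000002\tretq",
]
selected = list(select_symbol(sample, "foo"))
```

With the sample above, `selected` holds the `_foo:` label and the two
lines under it, and stops at `_bar:`.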

>  a similar thing works for printing ranges. An
> easily-digested format is very useful for doing various sorts of analyses
> (instruction types, instruction length, average per-function instruction
> length, stack traffic analysis, instruction size distribution, etc.).

I am really with you here.  I've hatched more than a few Perl scripts
to chew objdump output.
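
To make the point concrete, here is the sort of throwaway analysis I
mean, sketched in Python rather than Perl.  It assumes a format where
each function starts with a label line ending in ':' and every other
non-blank line is one instruction; that layout and the sample data are
assumptions for illustration, not the output of any particular tool.

```python
def instructions_per_function(lines):
    """Count instruction lines under each label in a label-per-line listing."""
    counts = {}
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if fields[0].endswith(":"):
            current = fields[0][:-1]      # new function label, drop the ':'
            counts[current] = 0
        elif current is not None:
            counts[current] += 1          # one line, one instruction (assumed)
    return counts

# Hypothetical listing for illustration only.
listing = [
    "_foo:",
    "0000 pushq %rbp",
    "0001 retq",
    "",
    "_bar:",
    "0002 retq",
]
per_function = instructions_per_function(listing)
```

When the format wraps raw bytes or puts labels mid-line, the same job
needs far more special-casing, which is the pain being described.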

> There
> are many analyses that I haven't done due to GNU objdump's output format
> sucking. In general, otool's output format is extremely good in this regard,
> and if anything, IMO we should use *it* as a model

Disassemblers are down and dirty tools often needed by folks with weak
software backgrounds, e.g. hardware designers.  For this crowd,
elegant output matters less than dispensing with software tasks the
way they already know how.  Offering familiar behavior by default is a
win, but that doesn't mean we can't produce better output with a
command line option, argv[0] check or whatever.
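
A minimal sketch of what I mean by an option or argv[0] check, in
Python for brevity.  The style names ("gnu", "otool"), the function
name, and the idea of an explicit style option are all hypothetical;
none of this is existing llvm-objdump behavior.

```python
import os

def pick_output_style(argv0, style_option=None):
    """Choose an output style: an explicit option wins, else go by invoked name."""
    if style_option is not None:
        return style_option               # explicit command line choice
    tool = os.path.basename(argv0)
    if tool.endswith("otool"):
        return "otool"                    # invoked under an otool-like name
    return "gnu"                          # assumed familiar default otherwise

default_style = pick_output_style("/usr/bin/llvm-objdump")
symlink_style = pick_output_style("/usr/local/bin/otool")
forced_style = pick_output_style("/usr/bin/llvm-objdump", "otool")
```

The point is only that the familiar default and the better format can
coexist behind one small decision.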

>
> If you have questions, please ask on IRC or the mailing lists. A solution
> that doesn't make sense from the community's point of view (like a quick
> hack -bare to avoid needing to look at how you are affecting tests) will get
> ripped out: just recently you saw me do so for a number of objdump options.
>

Right, I'm invested in building a consensus.  I'll disagree with you
that -bare is a hack -- how else will tests handle default behavior
changes per platform?

> I recommend that you first look into changing these tests to use
> -filetype=asm (where possible) so that they don't even need to use objdump
> in the first place.
>

I'm trying to avoid more mission creep when I have an alternative,
e.g. -bare.  This engagement is already a heck of a lot bigger than when
I first stuck my nose in here :-)

>>
>>   The -bare option allows us to change output
>> in purely aesthetic ways without breaking tests.
>
>
> Is it worth burdening every developer in the LLVM project to remember to use
> -bare in order to get consistent output for tests? We already have enough
> word-of-mouth, poorly documented testing "best practices" (like <%s for file
> input to opt). I personally don't think that it's worth adding a new one
> just "to change output in purely aesthetic ways".
>

I hear you, but again, how do we allow per platform defaults without
breaking tests?  Maybe you disagree that we should have per-platform
defaults?