[PATCH] Usability improvements for Intel X86 disassembly (llvm)

Richard Mitton richard at codersnotes.com
Mon Jul 29 23:00:11 PDT 2013


That all makes a lot of sense, thank you. I'll try to resubmit these as 
separate patches tomorrow.

The motivation behind the asm-styles switch is to provide an output 
style that matches MASM (or TASM etc). MASM syntax is generally 
considered the de facto syntax for x86 assembly, and many programmers 
are used to it. The Intel manuals themselves use the same syntax for 
hexadecimal literals (a trailing 'h' suffix), for example. I agree that 
any changes made should also be accepted by the parser, but it doesn't 
seem entirely reasonable that one part can't be improved unless 
everything else is improved at the same time.
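
To make the three hex styles concrete, here is a minimal sketch of how 
they differ. This is not code from the patch; HexStyle and formatHex are 
hypothetical names used only for illustration, and the uppercase digits 
with a lowercase 'h' are my own choice.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical names for illustration only; not taken from the patch.
enum class HexStyle { C, Pascal, Asm };

// Format Value as a hexadecimal literal in the requested style.
std::string formatHex(std::uint64_t Value, HexStyle Style) {
  char Digits[17]; // 16 hex digits for a 64-bit value, plus the terminator
  std::snprintf(Digits, sizeof(Digits), "%llX",
                static_cast<unsigned long long>(Value));
  std::string Hex(Digits);

  switch (Style) {
  case HexStyle::C:
    return "0x" + Hex; // C style: 0xFF
  case HexStyle::Pascal:
    return "$" + Hex; // Pascal style: $FF
  case HexStyle::Asm:
    // MASM-style literals must begin with a decimal digit, so a leading
    // zero is added when the first hex digit is A-F: 0FFh, but 10h.
    if (Hex[0] >= 'A')
      Hex.insert(0, "0");
    return Hex + "h";
  }
  return Hex; // not reached; keeps compilers quiet about missing returns
}
```

With this sketch, the value 255 comes out as 0xFF, $FF or 0FFh depending 
on the style, and 16 comes out as 10h in the asm style.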

I don't think it's entirely true that there's one single canonical form 
for printing. Yes, there is a default, which should match the expected 
input format, but we already have options like AT&T vs. Intel, decimal 
vs. hex, etc. I think it's important to provide a good human-usable 
interface as well as a 'machine-readable' one.

Thanks,

Jim Grosbach wrote:
>    Hi Richard,
>
>    Thanks for working on this. Getting better assembly output from the compiler is a very good thing.
>
>    Some high-level comments to start with.
>
>    "immediate values can be selected to appear as either hex or decimal
>    register names and keywords lowercased to be consistent with both att syntax and general sanity.
>    annotated markup is supported in the disassembly (same as att)"
>
>    These are all great changes, but are independent of one another. If it's not too much to ask, it'd be best to separate these out into three smaller patches which each do only one of these things.
>
>    "fixed a bug where MOV16016a would disassemble wrongly for Intel."
>
>    Likewise, this is great, but should be a separate patch.
>
>    "added a switch for the user to select which style of hex constants they would prefer (c/pascal/asm style)"
>
>    Can you elaborate on this a bit? For example, can the assembly parser handle all three styles? If not, then that would be part of adding support for this. It's important that the compiler be able to assemble its own output. More broadly, what's the motivation here? Typically we do this the other way around, accepting a variety of forms for input in the parser but choosing a single canonical form for printing. There are obviously exceptions, but that's the general approach, so when we deviate from that, it's worth talking a bit about why and whether it's worth it.
>
> http://llvm-reviews.chandlerc.com/D1221
>    


