<div dir="ltr">Personally, I prefer always seeing the full function name, even if it's messy. As an aside, have you considered embedding the function name directly as the operand of the call/jump instruction, rather than as a comment? For example, instead of this:<br><div><br></div><div>0x107915443 <+51>: callq 0x107d87bb0 ; lldb_private::CommandObject::GetSelectedOrDummyTarget at CommandObject.cpp:1045<br></div><div><br></div><div>we would see this:</div><div><br></div><div>0x107915443 <+51>: callq lldb_private::CommandObject::GetSelectedOrDummyTarget (0x107d87bb0) ; [CommandObject.cpp:1045]</div><div><br></div><div>I find this much easier to read, because unless you're a magician who thinks in hexadecimal, chances are you are more interested in the name than the address, so with the current layout your eyes always have to scroll to the right to find the function name. This way your eyes don't always have to scroll to the right.</div></div><br><div class="gmail_quote">On Wed Feb 11 2015 at 9:12:49 PM Jason Molenda <<a href="mailto:jmolenda@apple.com">jmolenda@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">In r219544 (2014-10-10) I changed the default disassembly format to more closely resemble gdb's disassembly format. After living with this format for a few months, I see obvious shortcomings with C++ and Objective-C programs and I want to try a new approach.<br>
<br>
Originally, lldb's disassembly would display the Module & Function/Symbol name on a line by itself when a new function/symbol began. Each line of assembly displayed the file/load address followed by the opcode, operands, and comments (e.g. showing the target of a branch insn). Branches to the same function would have a comment listing the full function name plus an offset. Note that the address column did not show the offset into the function, just raw addresses, meaning you had to compare the full address of the branch target with the disassembly output to find the target of the branch. When the branch target was in inlined code, lldb would print all of the inlined functions in the comment field (on separate lines).<br>
<br>
In October I changed this to more closely resemble gdb's output: each line has the file/load address, the function name, the offset into the function ("+35"), opcode, operand, and comment. Comments pointing to the same function behaved the same as before, but inlined functions were not included. I try to elide function argument types (e.g. from a demangled C++ name), but with templated methods the name can still be enormous.<br>
<br>
This style of disassembly looks pretty good for short C function names, like<br>
<br>
(lldb) disass -c 20<br>
0x7fff94fbe188 <mach_msg_trap>: movq %rcx, %r10<br>
0x7fff94fbe18b <mach_msg_trap+3>: movl $0x100001f, %eax<br>
0x7fff94fbe190 <mach_msg_trap+8>: syscall<br>
-> 0x7fff94fbe192 <mach_msg_trap+10>: retq<br>
0x7fff94fbe193 <mach_msg_trap+11>: nop<br>
0x7fff94fbe194 <mach_msg_overwrite_trap>: movq %rcx, %r10<br>
<br>
but as soon as you get a hefty C++ name in there, it becomes very messy:<br>
<br>
0x107915454 <CommandObjectBreakpointList::DoExecute+68>: jne 0x1be9331 ; CommandObjectBreakpointList::DoExecute + 113 at CommandObjectBreakpoint.cpp:1420<br>
<br>
Or, an extreme example that I found in lldb with 30 seconds of looking (function name only):<br>
<br>
std::__1::function<std::__1::shared_ptr<lldb_private::TypeSummaryImpl> (lldb_private::ValueObject&)>::function<CommandObjectTypeSummary::CommandObjectTypeSummary(lldb_private::CommandInterpreter&)::'lambda'(lldb_private::ValueObject&)><br>
<br>
<br>
I want to go with a hybrid approach between these two styles. When there is a new symbol, we print the full module + function name. On each assembly line, we print the file/load address, the offset into the function in angle brackets, the opcode, and the operand; in the comments, branches to the SAME function use the <+36> style. An example:<br>
<br>
<br>
(lldb) disass<br>
LLDB`CommandObjectBreakpointList::DoExecute:<br>
0x107915410 <+0>: pushq %rbp<br>
0x107915411 <+1>: movq %rsp, %rbp<br>
0x107915414 <+4>: subq $0x170, %rsp<br>
0x10791541b <+11>: movq %rdi, -0x20(%rbp)<br>
0x10791541f <+15>: movq %rsi, -0x28(%rbp)<br>
0x107915423 <+19>: movq %rdx, -0x30(%rbp)<br>
0x107915427 <+23>: movq -0x20(%rbp), %rdx<br>
-> 0x10791542b <+27>: movq %rdx, %rsi<br>
0x10791542e <+30>: movb 0x165(%rdx), %al<br>
0x107915434 <+36>: andb $0x1, %al<br>
0x107915436 <+38>: movq %rsi, %rdi<br>
0x107915439 <+41>: movzbl %al, %esi<br>
0x10791543c <+44>: movq %rdx, -0xf8(%rbp)<br>
0x107915443 <+51>: callq 0x107d87bb0 ; lldb_private::CommandObject::GetSelectedOrDummyTarget at CommandObject.cpp:1045<br>
0x107915448 <+56>: movq %rax, -0x38(%rbp)<br>
0x10791544c <+60>: cmpq $0x0, -0x38(%rbp)<br>
0x107915454 <+68>: jne 0x107915481 ; <+113> at CommandObjectBreakpoint.cpp:1420<br>
0x10791545a <+74>: leaq 0xf54d21(%rip), %rsi ; "Invalid target. No current target or breakpoints."<br>
0x107915461 <+81>: movq -0x30(%rbp), %rdi<br>
0x107915465 <+85>: callq 0x107d93640 ; lldb_private::CommandReturnObject::AppendError at CommandReturnObject.cpp:135<br>
0x10791546a <+90>: movl $0x1, %esi<br>
0x10791546f <+95>: movq -0x30(%rbp), %rdi<br>
0x107915473 <+99>: callq 0x107d93760 ; lldb_private::CommandReturnObject::SetStatus at CommandReturnObject.cpp:172<br>
0x107915478 <+104>: movb $0x1, -0x11(%rbp)<br>
0x10791547c <+108>: jmp 0x1079158bd ; <+1197> at CommandObjectBreakpoint.cpp:1470<br>
0x107915481 <+113>: movq -0x38(%rbp), %rdi<br>
<br>
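As a rough illustration of the comment rule for branch targets (this is not the actual lldb code), here is a minimal Python sketch. The name branch_comment and the (start, end, name) symbol tuples are made up for the example, and a real symbol table would supply the function bounds.<br>
<br>
```python
def branch_comment(target_addr, current_func, symbols):
    """Return the comment text for a branch or call target.

    symbols is a hypothetical list of (start_addr, end_addr, name) tuples,
    with end_addr exclusive. It stands in for a real symbol lookup.
    """
    for start, end, name in symbols:
        if start <= target_addr < end:
            offset = target_addr - start
            if name == current_func:
                # Same function: the name is already printed once at the
                # top of the listing, so only the offset is shown.
                return f"<+{offset}>"
            # Different function: keep the full name, plus an offset
            # when the target is not the function's first instruction.
            return f"{name} + {offset}" if offset else name
    # No symbol covers the target address: emit no comment.
    return ""
```
<br>
With DoExecute starting at 0x107915410, a target of 0x107915481 gives "<+113>", which matches the jne at <+68> in the listing above.<br>
<br>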
The main drawback of this new arrangement is that you may be looking at a long series of instructions and forget the name of the function/method; you'll need to scroll back to the beginning of the disassembly to find it. Minor details include making two passes over the instruction list to find the maximum length of the address component and padding all the lines so the opcodes line up. For instance,<br>
<br>
(lldb) disass -c 30 -n mach_msg_trap<br>
libsystem_kernel.dylib`mach_msg_trap:<br>
0x7fff94fbe188 <+0>: movq %rcx, %r10<br>
0x7fff94fbe18b <+3>: movl $0x100001f, %eax<br>
0x7fff94fbe190 <+8>: syscall<br>
0x7fff94fbe192 <+10>: retq<br>
0x7fff94fbe193 <+11>: nop<br>
<br>
dyld`mach_msg_trap:<br>
0x7fff6a867210 <+0>: movq %rcx, %r10<br>
0x7fff6a867213 <+3>: movl $0x100001f, %eax<br>
0x7fff6a867218 <+8>: syscall<br>
0x7fff6a86721a <+10>: retq<br>
0x7fff6a86721b <+11>: nop<br>
<br>
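The two-pass padding can be sketched roughly as follows. This is only an illustration in Python, not the code in the patch; format_instructions and the (address, offset, text) tuples are invented for the example.<br>
<br>
```python
def format_instructions(insns):
    """Format one function's instructions so the opcodes line up.

    insns is a hypothetical list of (address, offset, text) tuples.
    """
    # Pass 1: build every address column and find the widest one.
    prefixes = [f"0x{addr:x} <+{off}>:" for addr, off, _ in insns]
    width = max((len(p) for p in prefixes), default=0)
    # Pass 2: pad each address column to that width, then add the text.
    return [
        f"{prefix.ljust(width)} {text}"
        for prefix, (_, _, text) in zip(prefixes, insns)
    ]
```
<br>
The first pass is needed because the width of the offset ("<+8>" versus "<+10>") is not known until the whole function has been seen.<br>
<br>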
<br>
The disassembly format can be overridden by the 'disassembly-format' setting if people have specific preferences. But I think this new hybrid style of disassembly will work best as the default given the kinds of method names we see with OO languages.<br>
<br>
Comments? I'd like to land this in a couple days if no one feels strongly about it.<br>
<br>
REPOSITORY<br>
rL LLVM<br>
<br>
<a href="http://reviews.llvm.org/D7578" target="_blank">http://reviews.llvm.org/D7578</a><br>
<br>
Files:<br>
include/lldb/Core/Address.h<br>
include/lldb/Core/Disassembler.h<br>
include/lldb/Core/FormatEntity.h<br>
include/lldb/Symbol/SymbolContext.h<br>
source/API/SBInstruction.cpp<br>
source/API/SBInstructionList.cpp<br>
source/Breakpoint/BreakpointLocation.cpp<br>
source/Commands/CommandObjectSource.cpp<br>
source/Core/Address.cpp<br>
source/Core/Debugger.cpp<br>
source/Core/Disassembler.cpp<br>
source/Core/FormatEntity.cpp<br>
source/Plugins/Disassembler/llvm/DisassemblerLLVMC.cpp<br>
source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp<br>
source/Symbol/SymbolContext.cpp<br>
source/Symbol/Variable.cpp<br>
source/Target/StackFrame.cpp<br>
source/Target/ThreadPlanTracer.cpp<br>
test/functionalities/abbreviation/TestAbbreviations.py<br>
test/functionalities/inferior-assert/TestInferiorAssert.py<br>
<br>
</blockquote></div>
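<br>
To make the operand-first layout suggested at the top of this message concrete, here is a small Python sketch that rewrites a line from the comment style into the proposed style. It is only a text-level illustration: the helper name_as_operand and its regular expression are made up for this example, and a real change would go through lldb's disassembly formatting rather than post-processing strings.<br>
<br>
```python
import re

# Matches "<address column>: <opcode> <hex address> ; <name> at <file:line>".
# The pattern is an assumption based on the example lines in this thread.
_BRANCH = re.compile(
    r"^(.*?:\s+)(\w+)\s+(0x[0-9a-fA-F]+)\s*;\s*(.+?) at (\S+:\d+)$"
)


def name_as_operand(line):
    """Move the symbol name from the comment into the operand position."""
    m = _BRANCH.match(line)
    if m is None:
        # Not a branch/call with a symbolic comment: leave it untouched.
        return line
    head, opcode, addr, name, location = m.groups()
    return f"{head}{opcode} {name} ({addr}) ; [{location}]"
```
<br>
Lines without a symbolic comment, such as a plain movq, pass through unchanged.<br>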