[PATCH] D32492: [llvm-dwarfdump] - Change format for .gdb_index dump.

Wed Apr 26 14:42:18 PDT 2017

I don't think that any tool currently depends on parsing that syntax, and I agree that using the mathematical [x, y) notation makes the most sense here. It would be great to wrap this in a formatDWARFrange() function so we can be consistent everywhere (and if it really turns out that I need to produce darwin-dwarfdump compatible output at some point, we have a single place where this can easily be parameterized).
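
A rough sketch of what such a helper could look like, written against the plain C++ standard library rather than LLVM's own format() and raw_ostream. The signature, the fixed 16-digit width, and the Sep parameter are assumptions made for illustration; this is not an existing LLVM function.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: renders a half-open address range as
// "[0x<low><Sep>0x<high>)". Sep is the one knob a caller could use to
// switch between the "[x, y)" and "[x - y)" spellings discussed here.
std::string formatDWARFrange(uint64_t Low, uint64_t High,
                             const char *Sep = ", ") {
  assert(Low <= High && "inverted address range");
  char Buf[96];
  // snprintf truncates instead of overflowing if Sep is unexpectedly long.
  std::snprintf(Buf, sizeof(Buf), "[0x%016" PRIx64 "%s0x%016" PRIx64 ")",
                Low, Sep, High);
  return Buf;
}
```

With this sketch, formatDWARFrange(0x4000e8, 0x4000f3) yields "[0x00000000004000e8, 0x00000000004000f3)", and passing " - " as Sep reproduces the spelling DW_AT_ranges dumping uses today.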

-- adrian

> On Apr 26, 2017, at 2:26 PM, Robinson, Paul <paul.robinson at sony.com> wrote:
> 
> I'd have a mild preference for [x, y), and I'm also okay with x – y without the [) around the pair. The [x – y) does just look funny. Although of course in context it's clear what's intended, so it's just a mild preference.
> --paulr
> From: llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] On Behalf Of David Blaikie via llvm-commits
> Sent: Wednesday, April 26, 2017 10:41 AM
> To: reviews+D32492+public+d04ad76ab213883a at reviews.llvm.org; grimar at accesssoftek.com; dccitaliano at gmail.com; rafael.espindola at gmail.com; Adrian Prantl
> Cc: llvm-commits at lists.llvm.org
> Subject: Re: [PATCH] D32492: [llvm-dwarfdump] - Change format for .gdb_index dump.
>  
>  
> 
> On Wed, Apr 26, 2017 at 3:31 AM George Rimar via Phabricator <reviews at reviews.llvm.org> wrote:
> grimar added inline comments.
> 
> 
> ================
> Comment at: lib/DebugInfo/DWARF/DWARFGdbIndex.cpp:42
>      OS << format(
> -        "    Low address = 0x%llx, High address = 0x%llx, CU index = %d\n",
> -        Addr.LowAddress, Addr.HighAddress, Addr.CuIndex);
> +        "    Low/High address = [0x%llx, 0x%llx] (Size: 0x%llx), CU id = %d\n",
> +        Addr.LowAddress, Addr.HighAddress, Addr.HighAddress - Addr.LowAddress,
> ----------------
> dblaikie wrote:
> > While you're here, probably makes sense to fix the range to render as [x, y) rather than [x, y] ? (since it's a half open range, if I recall correctly)
> I am not sure I understand why it is half open. If I understood your idea correctly, the output would probably be:
> ```
> [a, b) (c, d) (e, f) (g, h]
> ```
> 
> Nah, more like:
> 
> ```
> [a, b)
> [c, d)
> [e, f)
> [g, h)
> ```
> 
> But I do not think it would be correct.
> The .gdb_index format (https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html) defines the address area as:
> > The address area. The address area consists of a sequence of address entries. Each address entry has three elements:
> > The low address. This is a 64-bit little-endian value.
> > The high address. This is a 64-bit little-endian value. Like DW_AT_high_pc, the value is one byte beyond the end.
> 
> This bit here ^ "the value is one byte beyond the end" is what I mean by a half open range.
> 
> If a range in a range list includes the bytes at offsets/addresses 5, 6, 7, and 8, then the range would be emitted as "5, 9", which is half open, or written mathematically, [5, 9).
> 
> ( https://en.wikipedia.org/wiki/Interval_(mathematics) - "A half-open interval includes only one of its endpoints, and is denoted by mixing the notations for open and closed intervals. (0,1] means greater than 0 and less than or equal to 1, while [0,1) means greater than or equal to 0 and less than 1.")
>  
> > The CU index. This is an offset_type value.
> >
> 
> So we have a set of elements that have beginning and end marks. It looks to be a set of closed segments, if I am not missing something, so I think [x, y] is more appropriate.
> 
> I was interested in how readelf dumps the .gdb_index for the same file used in the test case.
> The output was:
> readelf --debug-dump=gdb_index dwarfdump-gdbindex-v7.elf-x86-64
> ```
> Contents of the .gdb_index section:
> Version 7
> 
> CU table:
> [  0] 0x0 - 0x33
> [  1] 0x34 - 0x67
> 
> TU table:
> 
> Address table:
> 00000000004000e8 00000000004000f3 0
> 00000000004000f3 00000000004000fe 1
> 
> Symbol table:
> [489] main: 0 [global, function]
> [754] int:
>         0 [static, type]
>         1 [static, type]
> [956] main2: 1 [global, function]
> ```
> 
> Based on the above, I can suggest two solutions:
> 1) Leave [x, y] as is.
> 2) Change dump output to something that avoids use of []() if that can be confusing, for example to:
> 
> ```
> Address area offset = 0x38, has 2 entries:
> 0x4000e8 - 0x4000f3, Size = 0xb, CU id = 0
> 0x4000f3 - 0x4000fe, Size = 0xb, CU id = 1
> ```
> 
> What do you think?
> 
> '-' could be ambiguous with the minus sign here.
> 
> Indeed llvm-dwarfdump uses the half open range notation [x, y) in the DW_AT_ranges dumping such as here:
> 
>   DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
>      [0x00000000004004f0 - 0x00000000004004f6)
>      [0x0000000000400500 - 0x0000000000400508))
>  
> though it also uses - rather than ','... which is weird.
> 
> Adrian - what do you reckon we should standardize on here? I do sort of get the x - y syntax, but it seems a bit awkward when combined with the [x, y) syntax (which I also like, for clarity about the boundaries). I'd tend to lean towards [x, y), but I could live with sticking with [x - y).
> 
> 
> 
> https://reviews.llvm.org/D32492