[llvm-dev] DWARF .debug_aranges data objects and address spaces

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 12 11:19:46 PDT 2020


On Thu, Mar 12, 2020 at 11:00 AM Luke Drummond <luke.drummond at codeplay.com>
wrote:

> On Thu Mar 12, 2020 at 5:37 PM, David Blaikie wrote:
> > On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond
> > <luke.drummond at codeplay.com>
> > wrote:
> >
> > > On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> > > > If you only want code addresses, why not use the CU's
> > > > low_pc/high_pc/ranges
> > > > - those are guaranteed to be only code addresses, I think?
> > > >
> > > In the common case, for most targets LLVM supports, I think you're right,
> > > but for my case, regrettably, not. Because my target is a Harvard
> > > Architecture, any code address can have the same ordinal value as any
> > > data address: the code and data reside on different buses so the whole
> > > 4GiB space is available to both code and data. `DW_AT_low_pc` and
> > > `DW_AT_high_pc` can be used to find the range of the code segment, but
> > > given an arbitrary address, cannot be used to conclusively determine
> > > whether that address belongs to code or data when both segments contain
> > > addresses in that numeric range.
> >
> >
> > Sorry I'm not following, partly probably due to my not having worked with
> > such machines before.
> >
> > But how are the code addresses and data addresses differentiated then (eg:
> > if you had segment selectors in debug_aranges, how would they be used? The
> > addresses taken from the system at runtime have some kind of segment
> > selector associated with them, that you can then use to match with the
> > addr+segment selector in aranges?).
> Yes. This. The system mostly provides us the ability to disambiguate
> addresses because the device's simulator / debugger make this
> unambiguous, but the current .debug_aranges does not allow us to do this
> because it's missing such info.
> >
> > Actually, coming at it from a different angle: It sounds like in the
> > original email you're suggesting if debug_aranges did not contain data
> > addresses, this would be good/sufficient for you? So somehow you'd be
> > ensuring you only query debug_aranges using things you know are code
> > addresses, not data addresses? So why would the same solution/approach not
> > hold to querying low/high/ranges on a CU that's already guaranteed not to
> > contain data addresses?
> That's the root of the issue: the .debug_aranges section emitted by llvm
> *does* contain data addresses by default and therefore can be ambiguous.
> I've worked around this locally by hacking llvm to only emit aranges for
> text objects,


Sorry, but I'm still not understanding why "aranges for only text objects"
is more usable for your use case than "high/low/ranges on the CU"? Could
you help me understand how those are different in your situation?


> but I was wondering if it's something that's valuable to
> fix upstream. My guess is that it's probably too niche to worry about
> for the moment, but if there's interest I can propose a design (probably
> a target hook to ask if segment selectors are required and how to get
> their number from an object).
>

Added a few debug info folks in case they've got opinions. I don't really
mind if we removed data objects from debug_aranges, though as you say, it's
arguably correct/maybe useful as-is. Supporting it properly, probably
using address segment selectors, would be fine too. I guess AVR uses address
spaces for its pointers to differentiate data and code addresses? In which
case we could encode the LLVM address space as the segment selector (and
probably would need to query the target to decide if it has non-zero
address spaces and use that to decide whether to use segment selectors in
debug_aranges).
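
To make the segment-selector idea a bit more concrete, here's a rough
sketch in Python (purely illustrative, not LLVM code). It assumes the
aranges tuples carry a segment selector alongside the address and length,
and that the selector is simply an address space number. The names
ArangeEntry, CODE_SPACE, DATA_SPACE and find_cu are made up for this
example, as are the address space numbers and CU offsets.

```python
from typing import List, NamedTuple, Optional


class ArangeEntry(NamedTuple):
    segment: int    # assumed: address space number used as the selector
    start: int      # first address covered by the entry
    length: int     # number of bytes covered
    cu_offset: int  # offset of the owning CU (hypothetical values below)


# Hypothetical address space numbers for a Harvard-style target.
DATA_SPACE = 0
CODE_SPACE = 1


def find_cu(entries: List[ArangeEntry], segment: int, address: int) -> Optional[int]:
    """Return the CU offset covering (segment, address), or None."""
    for e in entries:
        if e.segment == segment and e.start <= address < e.start + e.length:
            return e.cu_offset
    return None


# Code and data both occupy 0x1000, but in different address spaces.
entries = [
    ArangeEntry(CODE_SPACE, 0x1000, 0x200, 0x00),
    ArangeEntry(DATA_SPACE, 0x1000, 0x40, 0x4C),
]

code_cu = find_cu(entries, CODE_SPACE, 0x1000)
data_cu = find_cu(entries, DATA_SPACE, 0x1000)
```

Without the segment field, a lookup of 0x1000 would match both entries,
which is the ambiguity Luke describes.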

But in general, I'm mostly just discouraging people from using aranges -
the data is duplicated in the CU's ranges anyway (there are some small
caveats there - a producer doesn't /have/ to produce ranges on the CU, but
I'd just say lower performance on such DWARF would be acceptable) and aranges
make object files/executables larger for minimal value/mostly duplicate data.
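
As a rough sketch of what I mean by using the CU's ranges instead (Python,
illustrative only): a consumer can build its own address-to-CU index from
each CU's low_pc/high_pc/ranges. The input dictionary here is a hypothetical
stand-in for whatever a DWARF parser hands back, build_index and lookup are
made-up names, and the sketch assumes the ranges are half-open and don't
overlap.

```python
import bisect
from typing import Dict, List, Optional, Tuple

# Hypothetical parsed form: CU offset -> list of [low, high) code ranges.
CuRanges = Dict[int, List[Tuple[int, int]]]


def build_index(cu_ranges: CuRanges) -> List[Tuple[int, int, int]]:
    """Flatten per-CU ranges into a list of (low, high, cu) sorted by low."""
    return sorted(
        (low, high, cu)
        for cu, ranges in cu_ranges.items()
        for low, high in ranges
        if low < high  # skip empty ranges
    )


def lookup(index: List[Tuple[int, int, int]], pc: int) -> Optional[int]:
    """Return the CU offset whose range contains pc, or None."""
    # Find the last entry whose low address is <= pc.
    i = bisect.bisect_right(index, (pc, float("inf"), float("inf"))) - 1
    if i >= 0:
        low, high, cu = index[i]
        if low <= pc < high:
            return cu
    return None


cu_ranges = {
    0x00: [(0x1000, 0x1200)],
    0x4C: [(0x1200, 0x1300), (0x2000, 0x2010)],
}
index = build_index(cu_ranges)
```

The index costs a pass over the CUs at load time, which is the "lower
performance" trade-off mentioned above.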

- Dave


>
> Thanks for your help
>
> Luke
>
> --
> Codeplay Software Ltd.
> Company registered in England and Wales, number: 04567874
> Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF
>