[llvm-dev] DWARF .debug_aranges data objects and address spaces

Luke Drummond via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 12 11:00:11 PDT 2020

On Thu Mar 12, 2020 at 5:37 PM, David Blaikie wrote:
> On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond <luke.drummond at codeplay.com> wrote:
> > On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> > > If you only want code addresses, why not use the CU's
> > > low_pc/high_pc/ranges - those are guaranteed to be only code
> > > addresses, I think?
> > >
> > In the common case, for most targets LLVM supports I think you're right,
> > but for my case, regrettably, not. Because my target is a Harvard
> > Architecture, any code address can have the same ordinal value as any
> > data address: the code and data reside on different buses so the whole
> > 4GiB space is available to both code and data. `DW_AT_low_pc` and
> > `DW_AT_high_pc` can be used to find the range of the code segment, but
> > given an arbitrary address, cannot be used to conclusively determine
> > whether that address belongs to code or data when both segments contain
> > addresses in that numeric range.
> Sorry I'm not following, partly probably due to my not having worked
> with such machines before.
> But how are the code addresses and data addresses differentiated then
> (eg: if you had segment selectors in debug_aranges, how would they be
> used? The addresses taken from the system at runtime have some kind of
> segment selector associated with them, that you can then use to match
> with the addr+segment selector in aranges?).
Yes, exactly that. The system mostly lets us disambiguate addresses,
because the device's simulator and debugger make the address space
unambiguous, but the current .debug_aranges does not let us make use of
that because it carries no segment information to match against.
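
To make the ambiguity concrete, here is a minimal Python sketch. It is
not LLVM or debugger code: `ArangeEntry`, the `CODE_SEGMENT` and
`DATA_SEGMENT` values, and both lookup functions are hypothetical names
chosen for illustration. It assumes each aranges tuple carries a segment
selector next to its address and length, which the DWARF tuple layout
permits when the header's segment selector size is non-zero.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical selector values; a real target would define its own encoding.
CODE_SEGMENT = 0
DATA_SEGMENT = 1


@dataclass(frozen=True)
class ArangeEntry:
    segment: int    # segment selector (assumed to be present in the tuple)
    start: int      # first address covered by the range
    length: int     # number of bytes covered
    cu_offset: int  # offset of the owning CU in .debug_info


def covers(entry: ArangeEntry, address: int) -> bool:
    """True if address falls inside the half-open range [start, start + length)."""
    return entry.start <= address < entry.start + entry.length


def lookup_flat(entries: List[ArangeEntry], address: int) -> List[int]:
    """Address-only lookup, as with today's tuples: may match code and data."""
    return [e.cu_offset for e in entries if covers(e, address)]


def lookup_segmented(entries: List[ArangeEntry], segment: int, address: int) -> List[int]:
    """Lookup keyed on (segment, address): only ranges in that segment match."""
    return [e.cu_offset for e in entries if e.segment == segment and covers(e, address)]


# A function and a data object that share the same numeric address,
# as can happen when code and data live on separate buses.
entries = [
    ArangeEntry(CODE_SEGMENT, 0x1000, 0x100, cu_offset=0x00),
    ArangeEntry(DATA_SEGMENT, 0x1000, 0x40, cu_offset=0x2A),
]
```

With these entries, `lookup_flat(entries, 0x1010)` returns both CU
offsets, while `lookup_segmented(entries, CODE_SEGMENT, 0x1010)` returns
only the code CU.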
> Actually, coming at it from a different angle: It sounds like in the
> original email you're suggesting if debug_aranges did not contain data
> addresses, this would be good/sufficient for you? So somehow you'd be
> ensuring you only query debug_aranges using things you know are code
> addresses, not data addresses? So why would the same solution/approach
> not hold to querying low/high/ranges on a CU that's already guaranteed
> not to contain data addresses?
That's the root of the issue: the .debug_aranges section emitted by LLVM
*does* contain data addresses by default and can therefore be ambiguous.
I've worked around this locally by hacking LLVM to only emit aranges for
text objects, but I was wondering whether it's something that's valuable
to fix upstream. My guess is that it's probably too niche to worry about
for the moment, but if there's interest I can propose a design (probably
a target hook to ask whether segment selectors are required and how to
get their number from an object).
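
For illustration only, the effect of the local workaround can be
sketched in Python as a filter applied before ranges are emitted. This
is not the actual LLVM change: the `Symbol` type, the
`text_only_aranges` function, and the idea of identifying text objects
by an exact section name are all assumptions made for the example.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Symbol(NamedTuple):
    name: str
    section: str  # name of the section holding the object (assumed available)
    address: int
    size: int


def text_only_aranges(
    symbols: Sequence[Symbol],
    text_sections: Sequence[str] = (".text",),
) -> List[Tuple[int, int]]:
    """Return (address, length) tuples for objects in text sections only.

    Data objects are dropped, so an address found in the result can only
    refer to code. Section names are matched exactly in this sketch.
    """
    return [
        (s.address, s.size)
        for s in symbols
        if s.section in text_sections and s.size > 0
    ]


symbols = [
    Symbol("main", ".text", 0x1000, 0x80),
    Symbol("table", ".data", 0x1000, 0x40),  # same numeric address as main
]
```

Here `text_only_aranges(symbols)` keeps only the range for `main`, which
removes the ambiguity at the cost of losing address lookup for data
objects. Carrying a segment selector per tuple would avoid that loss.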

Thanks for your help


Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF
