[llvm-dev] DWARF .debug_aranges data objects and address spaces

Luke Drummond via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 12 11:00:11 PDT 2020


On Thu Mar 12, 2020 at 5:37 PM, David Blaikie wrote:
> On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond <luke.drummond at codeplay.com> wrote:
>
> > On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:
> > > If you only want code addresses, why not use the CU's
> > > low_pc/high_pc/ranges - those are guaranteed to be only code
> > > addresses, I think?
> > >
> > In the common case, for most targets LLVM supports I think you're right,
> > but for my case, regrettably, not. Because my target is a Harvard
> > Architecture, any code address can have the same ordinal value as any
> > data address: the code and data reside on different buses so the whole
> > 4GiB space is available to both code and data. `DW_AT_low_pc` and
> > `DW_AT_high_pc` can be used to find the range of the code segment, but
> > given an arbitrary address, cannot be used to conclusively determine
> > whether that address belongs to code or data when both segments contain
> > addresses in that numeric range.
>
>
> Sorry I'm not following, partly probably due to my not having worked
> with such machines before.
>
> But how are the code addresses and data addresses differentiated then
> (eg: if you had segment selectors in debug_aranges, how would they be
> used? The addresses taken from the system at runtime have some kind of
> segment selector associated with them, that you can then use to match
> with the addr+segment selector in aranges?).
Yes, exactly. The device's simulator / debugger tells us which address
space a given address belongs to, so at runtime we can mostly
disambiguate addresses. The .debug_aranges section as currently emitted
does not let us carry that distinction through, because it contains no
segment information.
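
To make the ambiguity concrete, here is a minimal sketch in Python (not
LLVM code). It models aranges entries as (segment, start, length) tuples,
which is the shape DWARF gives them when the header declares a non-zero
segment selector size. The names `ARange`, `CODE_SEGMENT`, `DATA_SEGMENT`
and the compile unit labels are all hypothetical and chosen only for
illustration; a real target would define its own selector values.

```python
from typing import List, NamedTuple, Optional

# Hypothetical selector values; a real target would pick its own.
CODE_SEGMENT = 0
DATA_SEGMENT = 1


class ARange(NamedTuple):
    segment: int  # segment selector (hypothetical numbering above)
    start: int    # first address covered by this entry
    length: int   # number of bytes covered
    cu: str       # illustrative label for the owning compile unit


def lookup_flat(ranges: List[ARange], address: int) -> List[str]:
    """Address-only lookup: return every CU whose entry covers `address`."""
    return [r.cu for r in ranges if r.start <= address < r.start + r.length]


def lookup_segmented(ranges: List[ARange], segment: int,
                     address: int) -> Optional[str]:
    """Segment-qualified lookup: match on the selector and the address."""
    for r in ranges:
        if r.segment == segment and r.start <= address < r.start + r.length:
            return r.cu
    return None


# On a Harvard target a function and a global can share a numeric address.
ranges = [
    ARange(CODE_SEGMENT, 0x1000, 0x200, "a.c"),  # a function in a.c
    ARange(DATA_SEGMENT, 0x1000, 0x40, "b.c"),   # a global in b.c
]

# Without a selector, 0x1010 matches both entries.
ambiguous = lookup_flat(ranges, 0x1010)

# With the selector supplied by the debugger, the match is unique.
code_cu = lookup_segmented(ranges, CODE_SEGMENT, 0x1010)
data_cu = lookup_segmented(ranges, DATA_SEGMENT, 0x1010)
```

The address-only lookup returns two candidates for 0x1010, while the
segment-qualified lookups each return exactly one compile unit.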
>
> Actually, coming at it from a different angle: It sounds like in the
> original email you're suggesting if debug_aranges did not contain data
> addresses, this would be good/sufficient for you? So somehow you'd be
> ensuring you only query debug_aranges using things you know are code
> addresses, not data addresses? So why would the same solution/approach
> not hold to querying low/high/ranges on a CU that's already guaranteed
> not to contain data addresses?
That's the root of the issue: the .debug_aranges section emitted by LLVM
*does* contain data addresses by default and therefore can be ambiguous.
I've worked around this locally by hacking LLVM to only emit aranges for
text objects, but I was wondering if it's something that's valuable to
fix upstream. My guess is that it's probably too niche to worry about
for the moment, but if there's interest I can propose a design (probably
a target hook to ask if segment selectors are required and how to get
their number from an object).

Thanks for your help

Luke

-- 
Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF

