<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Mar 12, 2020 at 11:00 AM Luke Drummond <<a href="mailto:luke.drummond@codeplay.com">luke.drummond@codeplay.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu Mar 12, 2020 at 5:37 PM, David Blaikie wrote:<br>
> On Wed, Mar 11, 2020 at 8:09 AM Luke Drummond<br>
> <<a href="mailto:luke.drummond@codeplay.com" target="_blank">luke.drummond@codeplay.com</a>><br>
> wrote:<br>
><br>
> > On Tue Mar 10, 2020 at 7:45 PM, David Blaikie wrote:<br>
> > > If you only want code addresses, why not use the CU's<br>
> > > low_pc/high_pc/ranges<br>
> > > - those are guaranteed to be only code addresses, I think?<br>
> > ><br>
> > In the common case, for most targets LLVM supports I think you're right,<br>
> > but for my case, regrettably, not. Because my target is a Harvard<br>
> > Architecture, any code address can have the same ordinal value as any<br>
> > data address: the code and data reside on different buses so the whole<br>
> > 4GiB space is available to both code, and data. `DW_AT_low_pc` and<br>
> > `DW_AT_high_pc` can be used to find the range of the code segment, but<br>
> > given an arbitrary address, cannot be used to conclusively determine<br>
> > whether that address belongs to code or data when both segments contain<br>
> > addresses in that numeric range.<br>
><br>
><br>
> Sorry I'm not following, partly probably due to my not having worked<br>
> with<br>
> such machines before.<br>
><br>
> But how are the code addresses and data addresses differentiated then<br>
> (eg:<br>
> if you had segment selectors in debug_aranges, how would they be used?<br>
> The<br>
> addresses taken from the system at runtime have some kind of segment<br>
> selector associated with them, that you can then use to match with the<br>
> addr+segment selector in aranges?).<br>
Yes. This. The system mostly provides us the ability to disambiguate<br>
addresses because the device's simulator / debugger make this<br>
unambiguous, but the current .debug_aranges does not allow us to do this<br>
because it's missing such info.<br>
><br>
> Actually, coming at it from a different angle: It sounds like in the<br>
> original email you're suggesting if debug_aranges did not contain data<br>
> addresses, this would be good/sufficient for you? So somehow you'd be<br>
> ensuring you only query debug_aranges using things you know are code<br>
> addresses, not data addresses? So why would the same solution/approach<br>
> not<br>
> hold to querying low/high/ranges on a CU that's already guaranteed not<br>
> to<br>
> contain data addresses?<br>
That's the root of the issue: the .debug_aranges section emitted by llvm<br>
*does* contain data addresses by default and therefore can be ambiguous.<br>
I've worked around this locally by hacking llvm to only emit aranges for<br>
text objects, </blockquote><div><br>Sorry, but I'm still not understanding why "aranges for only text objects" is more usable for your use case than "high/low/ranges on the CU"? Could you help me understand how those are different in your situation?<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">but I was wondering if it's something that's valuable to<br>
fix upstream. My guess is that it's probably too niche to worry about<br>
for the moment, but if there's interest I can propose a design (probably<br>
a target hook to ask if segment selectors are required and how to get<br>
their number from an object).<br></blockquote><div><br>Added a few debug info folks in case they've got opinions. I don't really mind if we removed data objects from debug_aranges, though as you say, it's arguably correct/maybe useful as-is. Supporting it properly, probably using address segment selectors, would be fine too. I guess AVR uses address spaces for its pointers to differentiate data and code addresses? In which case we could encode the LLVM address space as the segment selector (& probably would need to query the target to decide if it has non-zero address spaces and use that to decide whether to use segment selectors in debug_aranges)<br><br>But in general, I'm mostly just discouraging people from using aranges - the data is duplicated in the CU's ranges anyway (there are some small caveats there - a producer doesn't /have/ to produce ranges on the CU, but I'd just say lower performance on such DWARF would be acceptable) & it makes object files/executables larger for minimal value/mostly duplicate data.<br><br>- Dave<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Thanks for your help<br>
<br>
Luke<br>
<br>
-- <br>
Codeplay Software Ltd.<br>
Company registered in England and Wales, number: 04567874<br>
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF<br>
</blockquote></div></div>
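<div dir="ltr"><br>For readers following the thread, here is a minimal sketch of why an address-only lookup is ambiguous on a Harvard target and how a segment selector in each range entry could resolve it. It is not LLVM code and not a DWARF parser: the names ArangeEntry, CODE_SEGMENT, DATA_SEGMENT, find_cu and find_cu_without_segments, the selector values, and the example ranges are all hypothetical and chosen only for illustration.<br></div>

```python
from typing import List, NamedTuple, Optional

# Hypothetical selector values; a real target would define its own numbering
# (for example, derived from its LLVM address spaces).
CODE_SEGMENT = 0
DATA_SEGMENT = 1


class ArangeEntry(NamedTuple):
    segment: int  # hypothetical selector naming the bus / address space
    start: int    # first address covered by the range
    length: int   # number of bytes covered
    cu_name: str  # compile unit that owns the range


def find_cu(table: List[ArangeEntry], segment: int, address: int) -> Optional[str]:
    """Look up the owning compile unit using both the selector and the address."""
    for entry in table:
        if entry.segment != segment:
            continue  # same numeric address, different bus: not a match
        if address in range(entry.start, entry.start + entry.length):
            return entry.cu_name
    return None


def find_cu_without_segments(table: List[ArangeEntry], address: int) -> List[str]:
    """Address-only lookup; returns every range that numerically contains the address."""
    return [
        entry.cu_name
        for entry in table
        if address in range(entry.start, entry.start + entry.length)
    ]


# Code from a.c and data from b.c occupy overlapping numeric ranges.
table = [
    ArangeEntry(CODE_SEGMENT, 0x1000, 0x200, "a.c"),
    ArangeEntry(DATA_SEGMENT, 0x1000, 0x80, "b.c"),
]

ambiguous = find_cu_without_segments(table, 0x1040)  # matches both entries
code_owner = find_cu(table, CODE_SEGMENT, 0x1040)    # matches only the code range
data_owner = find_cu(table, DATA_SEGMENT, 0x1040)    # matches only the data range
```

<div dir="ltr">In this sketch the address-only lookup cannot tell the two compile units apart, while the lookup keyed on (segment, address) can, provided the caller knows which bus the address came from, as the simulator/debugger does in Luke's description.<br></div>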