[llvm-dev] Range lists, zero-length functions, linker gc

Robinson, Paul via llvm-dev llvm-dev at lists.llvm.org
Fri May 29 12:52:34 PDT 2020



> -----Original Message-----
> From: Alexey Lapshin <alapshin at accesssoftek.com>
> Sent: Friday, May 29, 2020 3:09 PM
> To: Robinson, Paul <paul.robinson at sony.com>; Fangrui Song
> <maskray at google.com>; David Blaikie <dblaikie at gmail.com>
> Cc: Sriraman Tallam <tmsriram at google.com>; Wei Mi <wmi at google.com>; Adrian
> Prantl <aprantl at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>;
> Alexey Lapshin <a.v.lapshin at mail.ru>; Eric Christopher
> <echristo at gmail.com>; peter.smith at arm.com; George Rimar
> <grimar at accesssoftek.com>; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc
> 
> 
> > Subject: Re: [llvm-dev] Range lists, zero-length functions, linker gc
> >
> > On 2020-05-28, David Blaikie wrote:
> > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul <paul.robinson at sony.com>
> > >wrote:
> > >
> > >> As has been mentioned elsewhere, Sony generally fixes up references from
> > >> debug info to stripped functions (of any length) using -1, because that's a
> > >> less-likely-to-be-real address than 0x0 or 0x1.  (0x0 is a typical base
> > >> address for shared libraries, I'd think using it has the potential to
> > >> mislead various consumers.)  For .debug_ranges we use -2, because both a
> > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section.
> > >>
> > >
> > >Any harm in using -2 everywhere, for consistency?
> >
> > When resolving a relocation, in certain cases we have to give an undefined
> > symbol a value.
> > This can happen with:
> >
> > * an undefined weak symbol
> > * an undefined global symbol in --noinhibit-exec mode (a buggy
> >   --gc-sections implementation can trigger this as well)
> > * a relocation referencing an undefined symbol in a non-SHF_ALLOC section
> >
> > We always respect the addend in a relocation entry for an absolute/PC-relative
> > (I can use "most" here) relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21,
> > R_X86_64_64, local exec TLS relocation types, ...).
> > Ignoring the addend (using -2 everywhere) will break this consistency.
> >
> > The relocated code may do pointer subtraction which would work if addends
> > were respected, but will break using -2 everywhere.
> 
> >I suspect David meant "any harm to using -2 in all .debug_* sections?"
> >and not literally everywhere.  Sony does special cases only for the
> >.debug_* sections.
> 
> >I've been meaning to propose that DWARF v6 reserve a special address for
> >this kind of situation.  Whether the committee would be willing to make
> >it be -1 or -2 for all targets, or make it target-defined, I don't know.
> 
> >(Dreading the inevitable argument over whether addresses are signed or
> >unsigned, or more to the point whether they wrap.  They've been unsigned
> >and wrapping was undefined on the small set of machines I'm familiar with.)
> >Certainly the toolchain community would benefit from making it be the
> >same everywhere.
> 
> >Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
> >sections be an extra-special case using -2.  We can (I hope) standardize
> >on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
> >practice for prior versions.
> 
> Would it make sense to use "LowPC > HighPC" in the DWARF documentation as a
> sign for that case, instead of -1 or -2?
> 
> Or, more correctly: to indicate that an address range points into deleted
> code, should either a zero-length range or a LowPC > HighPC range be used?
> 
> A zero-length address range is already defined in the DWARF documentation.
> LowPC > HighPC is currently not described. It could be documented and
> used as a general representation instead of a concrete special value.
> 
> An implementation could still use -2 for resolving relocations and it would
> satisfy the above definition.
> 
> Thank you, Alexey.

For addresses that are part of a range, that sounds reasonable.

Addresses are not always part of a range, however. I can think of
two cases where they are not:  DW_TAG_label points to a single
instruction, not a range; and the .debug_line section doesn't really
identify ranges, at least not directly.  I still think we'd want to
specify a reserved value.
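
To make the distinction concrete, here is a rough sketch of how a consumer
might apply the two checks discussed in this thread. It is only an
illustration under stated assumptions: the function names are hypothetical,
and the -1/-2 values are the ones mentioned above as current practice, not
anything the DWARF standard defines today.

```python
def tombstone(address_size: int, legacy_range_section: bool = False) -> int:
    """Return the assumed reserved address for a target.

    address_size is in bytes. -1 is modeled as the all-ones address.
    For pre-v5 .debug_ranges/.debug_loc, -2 is used instead, because
    0/0 and -1/-1 pairs already have a reserved meaning there.
    """
    max_addr = (1 << (8 * address_size)) - 1
    return max_addr - 1 if legacy_range_section else max_addr


def is_dead_address(addr: int, address_size: int,
                    legacy_range_section: bool = False) -> bool:
    """Check a single address (e.g. a DW_TAG_label or a line-table address).

    There is no second address to compare against, so only a reserved
    value can mark it as pointing into deleted code.
    """
    return addr == tombstone(address_size, legacy_range_section)


def is_dead_range(low_pc: int, high_pc: int) -> bool:
    """Check a range using the zero-length or LowPC > HighPC rule."""
    return low_pc >= high_pc
```

The range rule needs no reserved value at all, whereas a lone address can
only be recognized by comparing it against an agreed-upon constant.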

Thanks,
--paulr


