[llvm-dev] Range lists, zero-length functions, linker gc
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Sun May 31 10:30:58 PDT 2020
On Sun, May 31, 2020 at 9:57 AM Fangrui Song <maskray at google.com> wrote:
>
>
> On 2020-05-30, David Blaikie wrote:
> >On Sat, May 30, 2020 at 8:50 PM Fangrui Song <maskray at google.com> wrote:
> >>
> >> On 2020-05-29, David Blaikie wrote:
> >> >On Fri, May 29, 2020 at 2:20 PM Robinson, Paul <paul.robinson at sony.com> wrote:
> >> >> > On Fri, May 29, 2020 at 9:21 AM Robinson, Paul <paul.robinson at sony.com>
> >> >> > wrote:
> >> >> > > > On 2020-05-28, David Blaikie wrote:
> >> >> > > > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul
> >> >> > <paul.robinson at sony.com>
> >> >> > > > >wrote:
> >> >> > > > >
> >> >> > > > >> As has been mentioned elsewhere, Sony generally fixes up references
> >> >> > > > from
> >> >> > > > >> debug info to stripped functions (of any length) using -1, because
> >> >> > > > that’s a
> >> >> > > > >> less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a typical
> >> >> > base
> >> >> > > > >> address for shared libraries, I’d think using it has the potential
> >> >> > to
> >> >> > > > >> mislead various consumers.) For .debug_ranges we use -2, because
> >> >> > both
> >> >> > > > a
> >> >> > > > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section.
> >> >> > > > >>
> >> >> > > > >
> >> >> > > > >Any harm in using -2 everywhere, for consistency?
> >> >> > > >
> >> >> > > > When resolving a relocation, in certain cases we have to give an
> >> >> > undefined
> >> >> > > > symbol a value.
> >> >> > > > This can happen with:
> >> >> > > >
> >> >> > > > * an undefined weak symbol
> >> >> > > > * an undefined global symbol in --noinhibit-exec mode (a buggy --gc-
> >> >> > > > sections implementation can trigger this as well)
> >> >> > > > * a relocation referencing an undefined symbol in a non-SHF_ALLOC
> >> >> > section
> >> >> > > >
> >> >> > > > We always respect the addend in a relocation entry for an absolute/PC-
> >> >> > > > relative (I can use "most" here)
> >> >> > > > relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64,
> >> >> > > > local exec TLS relocation types, ...)
> >> >> > > > Ignoring the addend (using -2 everywhere) will break this consistency.
> >> >> > > >
> >> >> > > > The relocated code may do pointer subtraction which would work if
> >> >> > addends
> >> >> > > > were
> >> >> > > > respected, but will break using -2 everywhere.
> >> >> > >
> >> >> > > I suspect David meant "any harm to using -2 in all .debug_* sections?"
> >> >> > > and not literally everywhere. Sony does special cases only for the
> >> >> > > .debug_* sections.
> >> >> >
> >> >> > Right - thanks for the clarification.
> >> >> >
> >> >> > > I've been meaning to propose that DWARF v6 reserve a special address for
> >> >> > > this kind of situation. Whether the committee would be willing to make
> >> >> > > it be -1 or -2 for all targets, or make it target-defined, I don't know.
> >> >> > > (Dreading the inevitable argument over whether addresses are signed or
> >> >> > > unsigned, or more to the point whether they wrap. They've been unsigned
> >> >> > > and wrapping was undefined on the small set of machines I'm familiar
> >> >> > with.)
> >> >> > > Certainly the toolchain community would benefit from making it be the
> >> >> > > same everywhere.
> >> >> > >
> >> >> > > Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
> >> >> > > sections be an extra-special case using -2. We can (I hope) standardize
> >> >> > > on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
> >> >> > > practice for prior versions.
> >> >> >
> >> >> > That'd make linking difficult - the unix linkers at least, currently
> >> >> > don't have to identify the DWARF version when linking - having to pass
> >> >> > an extra linking flag or have the linker parse any DWARF (what if an
> >> >> > object file contains more than one CU & the linker has to apply
> >> >> > different relocations in different parts of the object file because of
> >> >> > that?) would be a significant cost/problem, I think.
> >> >> >
> >> >> > Though I like the tidiness of -1 everywhere, that backwards
> >> >> > compatibility with debug_ranges (& debug_loc similarly) is a problem.
> >> >> > Though ld.bfd does special case debug_ranges (& should special case
> >> >> > debug_loc), perhaps that's the solution. -2 for debug_ranges and
> >> >> > debug_loc, -1 everywhere else (which effectively means everywhere in
> >> >> > DWARFv5 onwards)?
> >> >>
> >> >> Exactly. Base it on the section name, .debug_loc and .debug_ranges
> >> >> use -2 and all other .debug_* use -1. No explicit version check needed.
> >> >>
> >> >> In terms of *specification*, DWARF v6 would say to use -1, and the
> >> >> best-practices on the wiki would say "use -1, except for .debug_loc
> >> >> and .debug_ranges use -2."
> >> >
> >> >Sounds pretty good to me.
> >>
> >> Looks good to me, too.
> >>
> >> >
> >> >Ray - how do you feel about that? Do you think that's something lld
> >> >would be able to do?
> >>
> >> This seems fine. If my understanding is correct, for an R_X86_64_64
> >> referencing sym + addend, the relocated value is:
> >>
> >>   if is_defined(sym)
> >>     return addr(sym) + addend
> >>   if relocated_section is .debug_ranges or .debug_loc
> >>     return -2 + addend
> >>
> >>   // Every DWARF v5 section falls here
> >>   return -1 + addend
> >>
> >>
> >> I still want addend to take part in the computation. This makes
> >> subtraction sound and provides a bit more information (the original
> >> length).
> >>
> >> For Alexey's example, if we see [0xfffffffffffffffe, 0x0000000000000004)
> >> in .debug_ranges, we know that the original length was 6. DWARF
> >> consumers should allow [0xfffffffffffffffe, *) (but reject other
> >> (low > high) pairs) and emit no diagnostic (DWARFAddressRange::valid() needs an update).
> >
> >Ah, unfortunately, I don't think it would be OK to leave the addend. I
> >believe it needs to be handled sort of the way bfd ld does it (but -2
> >or -1, instead of 0 or 1) - dropping the addend.
> >
> >Otherwise we'd be at risk of having both the start and end of the
> >function having non-zero addend (if they were in a comdat group with
> >some other code/another function, for instance) and then both would
> >wraparound and be indistinguishable from normal address ranges.
> >
> >eg:
> >
> >__attribute__((section(".text.x"))) void f1() { }
> >__attribute__((section(".text.x"))) void f2() { }
> >int main() { }
> >$ clang++ rng.cpp -fuse-ld=lld -Wl,-gc-sections -g && llvm-dwarfdump a.out
> >DW_TAG_compile_unit
> >  DW_AT_ranges (0x00000000
> >     [0x0000000000000000, 0x0000000000000016)
> >     [0x0000000000400540, 0x0000000000400548))
> >...
> >  DW_TAG_subprogram
> >    DW_AT_low_pc (0x0000000000000000)
> >    DW_AT_high_pc (0x0000000000000006)
> >    DW_AT_name ("f1")
> >...
> >  DW_TAG_subprogram
> >    DW_AT_low_pc (0x0000000000000010)
> >    DW_AT_high_pc (0x0000000000000016)
> >    DW_AT_name ("f2")
> >...
> >
> >Linking this with gold or lld, leaves the low_pc of 'f2' non-zero (10,
> >in my case). Because the addend is non-zero. That makes it not
> >possible to identify as dead code - even if using -2, since the addend
> >would still wrap it back around to a positive value.
> >
> >Not the /most/ realistic example - there are cases where clang puts
> >multiple functions in the same comdat, though I think they still go in
> >separate sections - though maybe they don't actually need to be in
> >separate sections, since they're comdat'd together.
> >
> >I don't think it's meaningful to talk about the length of a function
> >that doesn't exist/there are no instructions for - so I don't think
> >there's a loss in fidelity to not have that information in the final
> >linked DWARF.
>
> Thanks for the (explicit section) example. There is an R_X86_64_64
> referencing .text.x+16 relocating .debug_ranges (beginning address
> offset) and another relocating .debug_info (DW_AT_low_pc)
Yeah, it's specifically the one in DW_AT_low_pc (+0x10) that's the
problem. The one in .debug_ranges is the end of the range that starts
at .text.x+0, so that particular pair (+0 to +0x16) would be OK with
-1,+0x16 - but the low_pc of +0x10 would be ambiguous/unidentifiable as
dead code.
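
To make the wraparound concrete, here is a minimal sketch of the arithmetic (the helper and constant names are made up for illustration; this is not lld code). It assumes 64-bit addresses, the proposed -2 value, and that the addend is kept. The offsets are the ones from the f1/f2 dump above:

```cpp
#include <cassert>
#include <cstdint>

using std::uint64_t;

// Hypothetical helper: the value written for a reference to discarded code
// if the linker keeps the addend. Unsigned arithmetic wraps modulo 2^64.
constexpr uint64_t tombstonePlusAddend(uint64_t tombstone, uint64_t addend) {
  return tombstone + addend;
}

// -2 as a 64-bit address (the value proposed for .debug_ranges/.debug_loc).
constexpr uint64_t kMinusTwo = UINT64_MAX - 1;

void example() {
  // f1 sits at .text.x+0x0 .. +0x6: its start stays recognizable.
  assert(tombstonePlusAddend(kMinusTwo, 0x0) == 0xfffffffffffffffeULL);
  assert(tombstonePlusAddend(kMinusTwo, 0x6) == 0x4);

  // f2 sits at .text.x+0x10 .. +0x16: both ends wrap to small positive
  // values, so the pair looks like an ordinary low-address range.
  assert(tombstonePlusAddend(kMinusTwo, 0x10) == 0xe);
  assert(tombstonePlusAddend(kMinusTwo, 0x16) == 0x14);
}
```

So f1 would still be identifiable by its start address, but f2 would come out as [0xe, 0x14), which a consumer can't tell apart from real code.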
> It looks like we have to ignore the addend. Amended formula for a
> relocated value in a non-SHF_ALLOC section:
>
>   if is_defined(sym)
>     return addr(sym) + addend
>   if relocated_section is .debug_ranges or .debug_loc
>     return -2
>
>   // Every DWARF v5 section falls here
>   return -1   # Question: why can't we use 0? [1]
Yep, that'd be the theory (assuming changing the value for
non-.debug_* sections from 0+addend to -1 is acceptable).
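
For reference, the amended formula could be written roughly as below. This is only a sketch under stated assumptions: the function name and parameters are hypothetical (not lld's or any other linker's API), addresses are 64-bit, and the caller has already established that the relocated section is non-SHF_ALLOC.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using std::uint64_t;

// Hypothetical sketch of the amended rule for a relocation in a
// non-SHF_ALLOC section. `defined` says whether the referenced symbol
// survived linking; `symAddr` is only meaningful when it did.
uint64_t resolveNonAllocReloc(bool defined, uint64_t symAddr,
                              std::int64_t addend,
                              const std::string& relocatedSection) {
  if (defined)
    return symAddr + static_cast<uint64_t>(addend);

  // Pre-v5 list sections reserve 0/0 and -1, so they get -2. Addend dropped.
  if (relocatedSection == ".debug_ranges" || relocatedSection == ".debug_loc")
    return UINT64_MAX - 1;  // -2

  // Everything else, including every DWARF v5 section. Addend dropped.
  return UINT64_MAX;  // -1
}

void example() {
  // Live symbol: the addend is respected as usual.
  assert(resolveNonAllocReloc(true, 0x400540, 8, ".debug_info") == 0x400548);
  // Dead f2 (addend 0x10): low_pc becomes -1 rather than wrapping to 0xf.
  assert(resolveNonAllocReloc(false, 0, 0x10, ".debug_info") == UINT64_MAX);
  // Dead range end (addend 0x16) in .debug_ranges becomes -2.
  assert(resolveNonAllocReloc(false, 0, 0x16, ".debug_ranges") ==
         UINT64_MAX - 1);
}
```

Keying on the section name is what avoids needing a DWARF version check in the linker.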
> >> David> I guess we'd need to probably have a conversation with the DWARF
> >> David> Committee and/or with debugger vendors (gdb and lldb) to ensure they'd
> >> David> be willing to make matching changes to support this...
> >> David>
> >> David> and since they might not support it out of the gate, perhaps we'd need
> >> David> it behind a flag for the current (albeit buggy) backwards
> >> David> compatibility? Or maybe it works well enough without explicit support
> >> David> already.
> >>
> >> Alexey> So after implementing this, some tools could potentially stop working.
> >> Alexey> I do not know of any such tools, so I am not sure whether that is a
> >> Alexey> problem.
> >>
> >> I hope we don't need a linker option (selecting -1+addend or 0+addend).
> >> If unfortunately it can't be avoided, we probably need to discuss it
> >> with binutils. I can do that.
> >>
> >> Another thought. I wonder whether non-debug non-SHF_ALLOC sections
> >> expect addr(undefined) to be 0. If they can live with any value, and if
> >> we don't need the -2 special case (to avoid collision with the base
> >> address selection entry -1),
> >
> >I think we will need that special case due to DWARFv4 for a while yet.
> >Though, yes - if the current addr(undefined) variance (0 for bfd,
> >0+addend for lld/gold) extends beyond .debug_* sections, /maybe/ we
> >could make it configurable with a default, while keeping the hardcoded
> >exception for .debug_ranges and .debug_loc of using -2. Though I'd
> >worry a bit about more open-ended use cases for the non-debug
> >sections. At least for the debug sections we have a pretty good idea
> >of which consumers to go and talk to/test with, etc.
>
> [1] I forgot to ask why we can't resolve addr(undefined)+addend to 0 in
> .debug_* (excluding .debug_ranges and .debug_loc).
The initial reason was to avoid terminating .debug_loc lists and
.debug_ranges lists early - but that doesn't apply since we are talking
about carving out -2 specially for them anyway.
I believe the other possible reason is for a system that uses 0 as a
valid executable address.
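
As a rough illustration of the early-termination issue, here is a sketch of how a consumer might walk a DWARF v4 .debug_ranges list. The type and function names are hypothetical. The 0/0 end-of-list entry and the -1 base address selection entry come from the DWARF v4 format, while skipping a -2 start is the convention proposed in this thread, not something any consumer is known to implement:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using std::uint64_t;
using RawEntry = std::pair<uint64_t, uint64_t>;  // (begin, end) as stored

constexpr uint64_t kBaseSelect = UINT64_MAX;     // -1: base address selection
constexpr uint64_t kDeadRange = UINT64_MAX - 1;  // -2: proposed dead-code mark

// Hypothetical consumer-side walk over one range list (64-bit addresses).
std::vector<RawEntry> liveRanges(const std::vector<RawEntry>& raw,
                                 uint64_t cuBase) {
  std::vector<RawEntry> out;
  uint64_t base = cuBase;
  for (const RawEntry& e : raw) {
    if (e.first == 0 && e.second == 0)
      break;  // end-of-list entry: everything after it is ignored
    if (e.first == kBaseSelect) {
      base = e.second;  // new base for the following entries
      continue;
    }
    if (e.first == kDeadRange)
      continue;  // discarded code under the proposed scheme
    if (e.first == e.second)
      continue;  // empty range
    out.push_back({base + e.first, base + e.second});
  }
  return out;
}

void example() {
  // Zero-length dead function resolved to 0/0: the live range is lost.
  assert(liveRanges({{0, 0}, {0x400540, 0x400548}, {0, 0}}, 0).empty());
  // Same list with the dead entry resolved to -2/-2: the live range survives.
  assert(liveRanges({{kDeadRange, kDeadRange}, {0x400540, 0x400548}, {0, 0}},
                    0).size() == 1);
}
```

With -2 reserved for these two sections, a dead entry can no longer be mistaken for the terminator, so that argument against 0 goes away for the remaining .debug_* sections.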
> Paul> As has been mentioned elsewhere, Sony generally fixes up references from
> Paul> debug info to stripped functions (of any length) using -1, because
> Paul> that’s a less-likely-to-be-real address than 0x0 or 0x1. (0x0 is a
> Paul> typical base address for shared libraries, I’d think using it has the
> Paul> potential to mislead various consumers.) For .debug_ranges we use -2,
> Paul> because both a 0/0 pair and a -1/-1 pair have a reserved meaning in that
> Paul> section.
>
> 0 in an ET_DYN object refers to the base address, where the ELF header
> (or headers of other binary formats) resides. The ELF header is unlikely
> to be a meaningful code sequence. In LLD and GNU ld -z separate-code's
> layout, it is mapped to a (non-executable) PF_R PT_LOAD segment.
>
> I suspect special cased 0 values may have been baked into some DWARF
> consumers.
>
> Do other binary formats/platforms allow the base address to be a
> meaningful place that we can't lose DWARF for? (If 0 is a meaningful
> place but its debuggability does not matter that much, we can still use
> 0 as a special value, losing the debuggability of address 0).
Yeah, that's the thing - while it might not be an issue for ELF, DWARF
would want a standard that's fairly resilient to quirky/interesting
use cases (admittedly - such platforms could equally want to make
their executable code way up in the address space near max or max - 1,
etc?).
I know Alexey was particularly interested in the problem of
low-address functions overlapping with the [0, length) ranges created
by dead code with gold/lld - but I'm not sure if he has a use case for
literal zero, or just "low but non-zero". Hopefully he can weigh in
here.
> >> we can add a more generic option
> >>
> >> -z undef-address-in-nonalloc=-1
> >>
> >> Applying to every relocation referencing an undefined symbol in a
> >> non-SHF_ALLOC section. Let -1 be the default value to make .debug_*
> >> (excluding .debug_ranges and .debug_loc) happy.
> >
> >If we end up blessing it as part of the DWARF spec, we probably
> >wouldn't want it to be user-configurable for the .debug_ sections, so
> >I'd hesitate to add that configurability to the linker lest we have to
> >revoke it to conform to DWARF (breaking flag compatibility with
> >previous versions of the linker, etc). Admittedly we'll be breaking
> >output compatibility with this change regardless, so potentially
> >having the flag as an escape hatch could be useful.
>
> For these open-ended non-debug non-SHF_ALLOC sections, some use section
> groups/SHF_LINK_ORDER to solve the problem. I can't think of an example
> of the rest. The flag is probably not going to be useful.
>
> We need to ask binutils whether they can think of some examples where
> the output compatibility matters.
*nod* wouldn't hurt to know/get some feedback on.
>
> >
> >> Link a program twice, with -z undef-sym=-1 and -z undef-sym=-2,
> >> respectively. A binary diff can reveal relocation entries referencing
> >> undefined symbols.
> >>
> >> >
> >> >> --paulr
> >> >>
> >> >> >
> >> >> > >
> >> >> > >
> >> >> > > >
> >> >> > > > The relocated code can be allocatable or not. Non-allocatable non-
> >> >> > debug
> >> >> > > > code can have meaningful pointer subtraction as well. This is why I am
> >> >> > > > not too fond of (using a fixed value everywhere).
> >> >> > > >
> >> >> > > > >(also, I had a silly idea, but what would happen if we added a CU
> >> >> > > > attribute
> >> >> > > > >with an address value that was a reference to a weak always-unused
> >> >> > > > symbol,
> >> >> > > > >that way the linker would fix it up with whatever its preferred magic
> >> >> > > > value
> >> >> > > > >was, and the consumer would then know what the magic value was that
> >> >> > > > >represented dead code? (though this would only work if the value were
> >> >> > > > used
> >> >> > > > >consistently everywhere - which is zero for gold/lld (well, almost...
> >> >> > you
> >> >> > > > >can still create situations where a non-zero value is used even for a
> >> >> > > > >low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0
> >> >> > > > elsewhere)
> >> >> > > > >or Sony (-2 in debug_ranges, -1 elsewhere)... - so, wouldn't actually
> >> >> > > > work
> >> >> > > > >for any producer currently, so maybe there's little value in that as
> >> >> > a
> >> >> > > > >feature))
> >> >> > > >
> >> >> > > > For a non-SHF_ALLOC section, LLD currently considers it a GC root if
> >> >> > all
> >> >> > > > the conditions below are satisfied:
> >> >> > > >
> >> >> > > > * not SHT_REL[A]
> >> >> > > > * not SHF_LINK_ORDER
> >> >> > > > * not in a section group
> >> >> > > >
> >> >> > > > (I managed to lobby the ideas to GNU ld. GNU ld from binutils 2.35
> >> >> > > > onwards will have mostly compatible semantics with LLD)
> >> >> > > >
> >> >> > > > There is a cost fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64
> >> >> > ->
> >> >> > > > each section takes 64 bytes in the section header table.
> >> >> >
> >> >> > Sorry, I missed a step here - you're talking about the cost to
> >> >> > fragmenting .debug_* sections as an alternative to choosing a special
> >> >> > address value to resolve for dead code? (by removing the DWARF that
> >> >> > refers to the dead code, instead of keeping it and having to write a
> >> >> > special address value into it?)
> >> >> >
> >> >> > Unfortunately, no matter the cost - that solution doesn't apply to
> >> >> > Split DWARF. Maybe at some point we'll want to have some output from
> >> >> > the linker that lists the dead/live code, and use that for building a
> >> >> > dwp (like dsymutil) but I don't think we can predicate correctness on
> >> >> > such a thing - in part because we'd still want to be able to read the
> >> >> > .dwo files without post-processing for more interactive/iterative
> >> >> > development scenarios. So we'd still need a special address value to
> >> >> > write into debug_addr when using Split DWARF, and I think it's
> >> >> > important to allow the non-split case to look like the split case
> >> >> > where it doesn't /have/ to diverge - if divergence provides benefits,
> >> >> > that's nice, but I don't think it'd be good to make that divergence
> >> >> > /necessary/.
> >> >> >
> >> >> > > > SHF_LINK_ORDER
> >> >> > > > has semantics of a lightweight section group. Assume we don't want to
> >> >> > > > have one .debug_* for each function section, this .debug_* will be a
> >> >> > GC
> >> >> > > > root. Relocations from it (even if the symbol is weak) will retain the
> >> >> > > > sections defining the symbols.
> >> >> > >
> >> >> > > We did some quick research into per-function .debug_info fragments a
> >> >> > > while back, putting the subprogram info into the same section group as
> >> >> > > the function; it was not an unqualified win. The very large number of
> >> >> > > sections costs processing time, and cross-section references added to
> >> >> > > the relocation count (I believe these can generally be resolved by MC
> >> >> > > in a non-fragmented .debug_info section).
> >> >> >
> >> >> > Yeah, you either pay more relocations (or size cost to use signatures
> >> >> > instead) and/or more size to duplicate some DIEs (like type units
> >> >> > currently duplicate fundamental/non-user-defined types into the type
> >> >> > unit).
> >> >> >
> >> >> > (& it's a non-starter for Split DWARF anyway)
> >> >> >
> >> >> > > James Henderson might have
> >> >> > > the actual results stashed somewhere.
> >> >> > >
> >> >> > > That approach *might* still be faster than post-processing a unified
> >> >> > > section, which IIUC is what D59553 does.
> >> >> > >
> >> >> > > >
> >> >> > > > So, this trick can't work without refining the --gc-sections rules
> >> >> > > > further.
> >> >> > >
> >> >> > > If I understand the objection, yeah, we can't have .debug_* sections
> >> >> > > being gc roots.
> >> >> >
> >> >> > I'm not sure I follow, here - "this trick" being "splitting debug info
> >> >> > into droppable chunks for each function/subprogram and putting those
> >> >> > chunks in the same comdat group as the function code itself" - that
> >> >> > trick wouldn't work because the current rules would move that debug
> >> >> > info from (where it currently is in one big debug_info section) a
> >> >> > non-gc root to (where it would go - becoming part of a comdat group) a
> >> >> > gc-root? OK, right. Agreed that's another reason (apart from the Split
> >> >> > DWARF one, and the size/reloc tradeoff ones) that would be
> >> >> > problematic.
> >> >> >
> >> >> > - Dave
> >> >> >
> >> >> > >
> >> >> > > --paulr
> >> >> > >
> >> >> > > >
> >> >> > > > >
> >> >> > > > >> If you’re looking only at zero-length functions, you can stop
> >> >> > there;
> >> >> > > > but
> >> >> > > > >> I’m not sure why stopping there solves much of a real problem, as
> >> >> > > > >> zero-length functions seem like a weird corner case.
> >> >> > > > >>
> >> >> > > > >
> >> >> > > > >They're the case that breaks existing usage by terminating the range
> >> >> > list
> >> >> > > > >early - the other existing usage seems to be fine with "resolve to
> >> >> > > > addend"
> >> >> > > > >strategy that lld and gold use - in that it moves most
> >> >> > dead/deduplicated
> >> >> > > > >functions outside the executable range and so consumers never come
> >> >> > asking
> >> >> > > > >for "what code is at instruction 5" because they're never executing
> >> >> > code
> >> >> > > > at
> >> >> > > > >a pc of 5. But, yes, this existing solution doesn't work once you
> >> >> > have
> >> >> > > > code
> >> >> > > > >mapped into low address spaces or have utterly massive functions that
> >> >> > > > might
> >> >> > > > >have a length that would reach into the executable address space even
> >> >> > > > when
> >> >> > > > >their start is remapped to zero.
> >> >> > > >
> >> >> > > > For posterity, David gave me an example offline: void f1() { } void
> >> >> > f2() {
> >> >> > > > } int main() { f1(); }
> >> >> > > >
> >> >> > > > clang -fuse-ld=bfd -ffunction-sections -Wl,--gc-sections -g a.c -o
> >> >> > a.bfd
> >> >> > > > llvm-dwarfdump -debug-ranges a.bfd
> >> >> > > > =>
> >> >> > > > R_X86_64_64 relocations in .debug_ranges are resolved to 1, ignoring
> >> >> > the
> >> >> > > > addend
> >> >> > > >
> >> >> > > > (Behavior introduced in
> >> >> > > > https://sourceware.org/git/?p=binutils-gdb.git;a=blobdiff;f=bfd/ChangeLog;h=8fbaed21fa2c8238459acb637545583f3cfbbfdf;hp=18a3a67be3a5980998c4461b5a739e54f3551b17;hb=e4067dbb2a3368dbf908b39c5435c84d51abc9f3;hpb=c0621d88b096cc046adf6ed484baea9ba5bfe721)
> >> >> > > >
> >> >> > > > The comments below are also insightful. I need to ponder more (and
> >> >> > need
> >> >> > > > to read the DWARF v4 and v5 specs more as I am not so familiar with these
> >> >> > > > DWARF constructs). But it is too late now. Will probably comment
> >> >> > > > another day :)
> >> >> > > >
> >> >> > > > >
> >> >> > > > >> Linkers know how to strip dead functions (gc) or deduplicate them
> >> >> > (icf,
> >> >> > > > >> COMDAT) and people do this all the time, in some cases (COMDAT)
> >> >> > without
> >> >> > > > >> explicitly asking for it, so non-zero-length functions seem like
> >> >> > the
> >> >> > > > much
> >> >> > > > >> more interesting case. In that situation, -1 (or -2) seems like a
> >> >> > much
> >> >> > > > >> wiser choice of blessed-as-not-real address, versus 0x0 or 0x1.
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> Stripping non-zero-length functions does mean you have to care
> >> >> > about
> >> >> > > > more
> >> >> > > > >> sections. For example .debug_loc would want to be fixed up the
> >> >> > same
> >> >> > > > way
> >> >> > > > >> as .debug_ranges, not because a debugger would care but so that
> >> >> > dumpers
> >> >> > > > >> would not run into the 0/0 brick wall.
> >> >> > > > >>
> >> >> > > > >
> >> >> > > > >Yep - in theory a consumer could actually use a loclist across
> >> >> > multiple
> >> >> > > > >sections (if a global variable got hoisted into a register for a
> >> >> > function
> >> >> > > > >for instance), but I don't know of any producers doing this today -
> >> >> > until
> >> >> > > > >then, yeah, it's just a dumping problem and ld.bfd does produce DWARF
> >> >> > > > that
> >> >> > > > >has that problem (because it resolves both relocations to dead code
> >> >> > > > >(begin/end of a range) to zero in all sections except debug_ranges,
> >> >> > so
> >> >> > > > >terminates the loclist early) - binutils objdump avoids dumping
> >> >> > the
> >> >> > > > >following corrupted fragment by only dumping hunks of debug_loc
> >> >> > starting
> >> >> > > > at
> >> >> > > > >places referenced from debug_info. Without debug_info it won't dump
> >> >> > > > >anything from debug_loc - and if the references from debug_info,
> >> >> > parsed
> >> >> > > > >until the 0,0 terminator don't cover the whole debug_loc section, it
> >> >> > > > prints
> >> >> > > > >messages saying there are "gaps".
> >> >> > > > >
> >> >> > > > >Agreed that you'd want debug_loc to have the same special handling as
> >> >> > > > >debug_ranges if it has special handling. Though ideally we'd pick a
> >> >> > value
> >> >> > > > >that works equally everywhere? (-2, by the sounds of it)
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >> We also fix up lengths in .debug_aranges to zero, although there
> >> >> > might
> >> >> > > > be
> >> >> > > > >> history behind that tactic that I’m not aware of; it seems like it
> >> >> > > > ought to
> >> >> > > > >> be unnecessary, if consumers are aware of the special address(es).
> >> >> > > > >>
> >> >> > > > >
> >> >> > > > >Yeah, no idea about debug_aranges... I'd have thought it'd be fine
> >> >> > with
> >> >> > > > the
> >> >> > > > >same approach as debug_ranges, but I haven't looked at debug_aranges
> >> >> > in a
> >> >> > > > >long time.
> >> >> > > > >
> >> >> > > > >I guess the only remaining question is: Since it's possible to have
> >> >> > code
> >> >> > > > on
> >> >> > > > >some systems down at address zero, or close enough to it that [0,
> >> >> > length)
> >> >> > > > >might overlap with real executable code addresses - does anyone know
> >> >> > of
> >> >> > > > >the inverse: where code is mapped up near uint32 max? Such that that
> >> >> > > > usage
> >> >> > > > >wouldn't be able to sacrifice uint32 max - 1 to use as a blessed
> >> >> > value
> >> >> > > > here?
> >> >> > > > >
> >> >> > > > >- Dave
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> --paulr
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> *From:* Alexey Lapshin <alapshin at accesssoftek.com>
> >> >> > > > >> *Sent:* Thursday, May 28, 2020 9:03 AM
> >> >> > > > >> *To:* Sriraman Tallam <tmsriram at google.com>; Wei Mi
> >> >> > <wmi at google.com>;
> >> >> > > > >> Robinson, Paul <paul.robinson at sony.com>; Adrian Prantl
> >> >> > > > <aprantl at apple.com>;
> >> >> > > > >> Jonas Devlieghere <jdevlieghere at apple.com>; Alexey Lapshin <
> >> >> > > > >> a.v.lapshin at mail.ru>; Eric Christopher <echristo at gmail.com>;
> >> >> > Fangrui
> >> >> > > > Song
> >> >> > > > >> <maskray at google.com>; David Blaikie <dblaikie at gmail.com>;
> >> >> > > > >> llvm-dev at lists.llvm.org
> >> >> > > > >> *Subject:* Re: [llvm-dev] Range lists, zero-length functions,
> >> >> > linker gc
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> Hi David,
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> >So there have been several recent discussions about the issues
> >> >> > around
> >> >> > > > >>
> >> >> > > > >> >DWARF-agnostic linking and gc-sections, linkonce function
> >> >> > definitions
> >> >> > > > >> being
> >> >> > > > >>
> >> >> > > > >> >dropped, etc - and just how much DWARF-awareness would be suitable
> >> >> > > > >>
> >> >> > > > >> >in a linker to help with this situation.
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> > I'd like to discuss a narrower instance of this issue: Zero
> >> >> > length
> >> >> > > > >> gc'd/deduplicated functions.
> >> >> > > > >>
> >> >> > > > >> > LLVM seems to at least produce zero length functions in a few
> >> >> > cases:
> >> >> > > > >> > * non-void function without a return statement
> >> >> > > > >> > * function definition containing only llvm_unreachable
> >> >> > > > >> > (both of these trap at -O0, but at higher optimization levels
> >> >> > even
> >> >> > > > the
> >> >> > > > >> trap
> >> >> > > > >>
> >> >> > > > >> > instruction is removed & you get the full power UB of control
> >> >> > > > >> flowing off
> >> >> > > > >>
> >> >> > > > >> > the end of the function into whatever other bytes are after that
> >> >> > > > >> function)
> >> >> > > > >>
> >> >> > > > >> > So, for context, debug_ranges (this whole issue doesn't exist in
> >> >> > > > >> DWARFv5,
> >> >> > > > >>
> >> >> > > > >> > FWIW) is a list of address pairs, terminated by a pair of zeros.
> >> >> > > > >>
> >> >> > > > >> > With function sections, or even just with normal C++ inline
> >> >> > > > functions,
> >> >> > > > >>
> >> >> > > > >> > the CU will have a range entry for that function that consists of
> >> >> > two
> >> >> > > > >> relocations
> >> >> > > > >>
> >> >> > > > >> > - to the start and end of the function. Generally the start of
> >> >> > the
> >> >> > > > >> function is the
> >> >> > > > >>
> >> >> > > > >> > start of the section, and the end is "start of function + length
> >> >> > of
> >> >> > > > >> function (aka addend)".
> >> >> > > > >>
> >> >> > > > >> > Usually any relocation to the section would keep that section
> >> >> > > > "alive"
> >> >> > > > >> during linking -
> >> >> > > > >>
> >> >> > > > >> > but that would cause debug info to defeat linker GC and
> >> >> > > > deduplication.
> >> >> > > > >> So there's
> >> >> > > > >>
> >> >> > > > >> > special rules for how linkers handle these relocations in debug
> >> >> > info
> >> >> > > > to
> >> >> > > > >> allow the
> >> >> > > > >>
> >> >> > > > >> > sections to be dropped - what do you write in the bytes that
> >> >> > > > requested
> >> >> > > > >> the relocation?
> >> >> > > > >>
> >> >> > > > >> > Binutils ld: Special cases only debug_ranges, resolving all
> >> >> > > > relocations
> >> >> > > > >> to dead
> >> >> > > > >>
> >> >> > > > >> > code to 1. In other debug sections, these values are all resolved
> >> >> > to
> >> >> > > > >> zero.
> >> >> > > > >>
> >> >> > > > >> > Gold and lld: Special cases all debug info sections - resolving
> >> >> > all
> >> >> > > > >> relocations
> >> >> > > > >>
> >> >> > > > >> > to "addend" (so begin usually goes to zero, end goes to "size of
> >> >> > > > >> function")
> >> >> > > > >>
> >> >> > > > >> > These special rules are designed to ensure
> >> >> > omitted/gc'd/deduplicated
> >> >> > > > >> functions
> >> >> > > > >>
> >> >> > > > >> > don't cause the range list to terminate prematurely (which would
> >> >> > > > happen
> >> >> > > > >> if begin/end
> >> >> > > > >>
> >> >> > > > >> > were both resolved to zero).
> >> >> > > > >>
> >> >> > > > >> >But with an empty function, gold and lld's strategy here fails to
> >> >> > > > avoid
> >> >> > > > >> terminating a
> >> >> > > > >>
> >> >> > > > >> >range list by accident.
> >> >> > > > >>
> >> >> > > > >> > What should we do about it?
> >> >> > > > >>
> >> >> > > > >> > 1) Ensure no zero-length functions exist? (doesn't address
> >> >> > backwards
> >> >> > > > >>
> >> >> > > > >> > compatibility/existing functions/other compilers)
> >> >> > > > >> > 2) adopt the binutils approach to this (at least in debug_ranges
> >> >> > -
> >> >> > > > maybe
> >> >> > > > >> in all
> >> >> > > > >>
> >> >> > > > >> > debug sections? (doing it in other sections could break )
> >> >> > > > >> > 3) Revisit the discussion about using an even more 'blessed'
> >> >> > value,
> >> >> > > > >>
> >> >> > > > >> > like int max-1? (https://reviews.llvm.org/D59553)
> >> >> > > > >>
> >> >> > > > >> > (I don't have links to all the recent threads about this
> >> >> > discussion
> >> >> > > > - I
> >> >> > > > >> think D59553
> >> >> > > > >>
> >> >> > > > >> > might've spawned a separate broader discussion/non-review - oh,
> >> >> > > > Alexey
> >> >> > > > >> wrote a
> >> >> > > > >>
> >> >> > > > >> > good summary with links to other discussions here:
> >> >> > > > >>
> >> >> > > > >> > http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html
> >> >> > > > >> )
> >> >> > > > >>
> >> >> > > > >> > Thoughts?
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> I think for the problem of "zero length functions and .debug_ranges"
> >> >> > > > >> the binutils approach looks good:
> >> >> > > > >>
> >> >> > > > >> >Special cases only debug_ranges, resolving all relocations to
> >> >> > > > >> >dead code to 1. In other debug sections, these values are all resolved
> >> >> > > > >> >to zero.
> >> >> > > > >>
> >> >> > > > >> But this would not completely solve the problem from
> >> >> > > > >> https://reviews.llvm.org/D59553 - overlapped address ranges. The binutils
> >> >> > > > >> approach will solve the problem if the address range is specified as
> >> >> > > > >> start_address:end_address. While resolving relocations, it would replace
> >> >> > > > >> such a range with 1:1. However, it would not work if address ranges were
> >> >> > > > >> specified as start_address:length, since the length is not relocated. This
> >> >> > > > >> case could additionally be fixed by a fast scan of debug_info for High_PC
> >> >> > > > >> defined as length and changing it to 1. Something which you suggested here:
> >> >> > > > >> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html .
> >> >> > > > >>
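The difference between the two encodings can be sketched as follows. The helper names and the 0x40-byte function are hypothetical; the only behaviour modelled is that a linker rewrites relocated address fields and leaves plain constants alone.

```python
DEAD = 1  # value a binutils-style linker writes for relocations into discarded code


def resolve(value, relocated, target_is_dead):
    """Relocated fields that point at dead code become DEAD; constants are untouched."""
    return DEAD if (relocated and target_is_dead) else value


def covered_bytes_start_end(low, high, dead):
    # Both bounds are addresses, so both carry relocations.
    low = resolve(low, True, dead)
    high = resolve(high, True, dead)
    return high - low


def covered_bytes_start_length(low, length, dead):
    # Only the start is an address; the length is a constant the linker never rewrites.
    low = resolve(low, True, dead)
    length = resolve(length, False, dead)
    return length


# Hypothetical 0x40-byte function at 0x1000 that has been garbage-collected.
print(covered_bytes_start_end(0x1000, 0x1040, dead=True))   # range collapses to 1:1
print(covered_bytes_start_length(0x1000, 0x40, dead=True))  # still claims 0x40 bytes at address 1
```

In the start:length form the dead range keeps its original size, which is why an extra pass over debug_info is suggested for that case.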
> >> >> > > > >> So it looks like the following solution could fix both problems and be
> >> >> > > > >> relatively fast:
> >> >> > > > >>
> >> >> > > > >> "Resolve all relocations from debug sections into dead code to 1. Parse
> >> >> > > > >> debug sections and replace HighPc of an address range pointing to dead code
> >> >> > > > >> and specified as length to 1".
> >> >> > > > >>
> >> >> > > > >> As a result, all address ranges pointing into dead code would be marked
> >> >> > > > >> as zero length.
> >> >> > > > >>
> >> >> > > > >> There still exists another problem:
> >> >> > > > >>
> >> >> > > > >> DWARF4: "A range list entry (but not a base address selection or end of
> >> >> > > > >> list entry) whose beginning and ending addresses are equal has no effect
> >> >> > > > >> because the size of the range covered by such an entry is zero."
> >> >> > > > >>
> >> >> > > > >> DWARF5: "A bounded range entry whose beginning and ending address offsets
> >> >> > > > >> are equal (including zero) indicates an empty range and may be ignored."
> >> >> > > > >>
> >> >> > > > >> These rules allow us to ignore zero-length address ranges. I.e., a tool
> >> >> > > > >> reading DWARF is permitted to ignore the related DWARF entries. In that
> >> >> > > > >> case, essential descriptions could be ignored. That problem could happen
> >> >> > > > >> with the -flto=thin example at https://reviews.llvm.org/D54747#1503720 .
> >> >> > > > >> In this example, all type definitions except one were replaced with
> >> >> > > > >> declarations by thinlto. The definition which was left is in a piece of
> >> >> > > > >> debug info related to deleted code. According to the zero-length rule, that
> >> >> > > > >> definition could be ignored, and finally, incomplete debug info could be
> >> >> > > > >> used.
> >> >> > > > >>
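A toy model of that concern, under loud assumptions: the unit names, the type name "S", and the idea that a consumer skips a whole unit whose ranges are all empty are invented for illustration. The DWARF text only says an empty range entry may be ignored.

```python
# Hypothetical, simplified units: (name, ranges, {type name: "definition" or "declaration"})
units = [
    ("a.o", [(0x1000, 0x1040)], {"S": "declaration"}),
    ("b.o", [(0x2000, 0x2080)], {"S": "declaration"}),
    ("c.o", [(1, 1)], {"S": "definition"}),  # its code was discarded by the linker
]


def is_empty(ranges):
    """True when every range in the unit covers zero bytes."""
    return all(begin == end for begin, end in ranges)


def visible_types(units, skip_empty_units):
    """Collect the best description of each type that a consumer would end up with."""
    seen = {}
    for _name, ranges, types in units:
        if skip_empty_units and is_empty(ranges):
            continue  # assumed consumer behaviour: drop units with only empty ranges
        for type_name, kind in types.items():
            if seen.get(type_name) != "definition":
                seen[type_name] = kind
    return seen


print(visible_types(units, skip_empty_units=False))  # the definition of S is found
print(visible_types(units, skip_empty_units=True))   # only declarations of S remain
```

A consumer that reads every unit still finds the one remaining definition; a consumer that takes the "may be ignored" permission broadly is left with declarations only.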
> >> >> > > > >> So it probably should be forbidden to generate debug_info which could
> >> >> > > > >> become incomplete after removing pieces related to zero-length address
> >> >> > > > >> ranges. Otherwise, creating zero-length address ranges could lead to
> >> >> > > > >> incomplete debug info.
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> Thank you, Alexey.
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >>