[llvm-dev] Range lists, zero-length functions, linker gc

Fangrui Song via llvm-dev llvm-dev at lists.llvm.org
Sun May 31 12:28:06 PDT 2020


On 2020-05-31, David Blaikie wrote:
>On Sun, May 31, 2020 at 9:57 AM Fangrui Song <maskray at google.com> wrote:
>>
>>
>> On 2020-05-30, David Blaikie wrote:
>> >On Sat, May 30, 2020 at 8:50 PM Fangrui Song <maskray at google.com> wrote:
>> >>
>> >> On 2020-05-29, David Blaikie wrote:
>> >> >On Fri, May 29, 2020 at 2:20 PM Robinson, Paul <paul.robinson at sony.com> wrote:
>> >> >> > On Fri, May 29, 2020 at 9:21 AM Robinson, Paul <paul.robinson at sony.com>
>> >> >> > wrote:
>> >> >> > > > On 2020-05-28, David Blaikie wrote:
>> >> >> > > > >On Thu, May 28, 2020 at 2:52 PM Robinson, Paul
>> >> >> > <paul.robinson at sony.com>
>> >> >> > > > >wrote:
>> >> >> > > > >
>> >> >> > > > >> As has been mentioned elsewhere, Sony generally fixes up references
>> >> >> > > > from
>> >> >> > > > >> debug info to stripped functions (of any length) using -1, because
>> >> >> > > > that’s a
>> >> >> > > > >> less-likely-to-be-real address than 0x0 or 0x1.  (0x0 is a typical
>> >> >> > base
>> >> >> > > > >> address for shared libraries, I’d think using it has the potential
>> >> >> > to
>> >> >> > > > >> mislead various consumers.)  For .debug_ranges we use -2, because
>> >> >> > both
>> >> >> > > > a
>> >> >> > > > >> 0/0 pair and a -1/-1 pair have a reserved meaning in that section.
>> >> >> > > > >>
>> >> >> > > > >
>> >> >> > > > >Any harm in using -2 everywhere, for consistency?
>> >> >> > > >
>> >> >> > > > When resolving a relocation, in certain cases we have to give an
>> >> >> > undefined
>> >> >> > > > symbol a value.
>> >> >> > > > This can happen with:
>> >> >> > > >
>> >> >> > > > * an undefined weak symbol
>> >> >> > > > * an undefined global symbol in --noinhibit-exec mode (a buggy --gc-
>> >> >> > > > sections implementation can trigger this as well)
>> >> >> > > > * a relocation referencing an undefined symbol in a non-SHF_ALLOC
>> >> >> > section
>> >> >> > > >
>> >> >> > > > We always respect the addend in a relocation entry for an absolute/PC-
>> >> >> > > > relative (I can use "most" here)
>> >> >> > > > relocation (R_ARM_THM_PC8, R_AARCH64_ADR_PREL_PG_HI21, R_X86_64_64,
>> >> >> > > > local exec TLS relocation types, ...)
>> >> >> > > > Ignoring the addend (using -2 everywhere) will break this consistency.
>> >> >> > > >
>> >> >> > > > The relocated code may do pointer subtraction which would work if
>> >> >> > addends
>> >> >> > > > were
>> >> >> > > > respected, but will break using -2 everywhere.
>> >> >> > >
>> >> >> > > I suspect David meant "any harm to using -2 in all .debug_* sections?"
>> >> >> > > and not literally everywhere.  Sony does special cases only for the
>> >> >> > > .debug_* sections.
>> >> >> >
>> >> >> > Right - thanks for the clarification.
>> >> >> >
>> >> >> > > I've been meaning to propose that DWARF v6 reserve a special address for
>> >> >> > > this kind of situation.  Whether the committee would be willing to make
>> >> >> > > it be -1 or -2 for all targets, or make it target-defined, I don't know.
>> >> >> > > (Dreading the inevitable argument over whether addresses are signed or
>> >> >> > > unsigned, or more to the point whether they wrap.  They've been unsigned
>> >> >> > > and wrapping was undefined on the small set of machines I'm familiar
>> >> >> > with.)
>> >> >> > > Certainly the toolchain community would benefit from making it be the
>> >> >> > > same everywhere.
>> >> >> > >
>> >> >> > > Personally I'd vote for -1, and make pre-v5 .debug_loc/.debug_ranges
>> >> >> > > sections be an extra-special case using -2.  We can (I hope) standardize
>> >> >> > > on -1 for v6 onward, and document -1/-2 on the DWARF wiki as recommended
>> >> >> > > practice for prior versions.
>> >> >> >
>> >> >> > That'd make linking difficult - the unix linkers at least, currently
>> >> >> > don't have to identify the DWARF version when linking - having to pass
>> >> >> > an extra linking flag or have the linker parse any DWARF (what if an
>> >> >> > object file contains more than one CU & the linker has to apply
>> >> >> > different relocations in different parts of the object file because of
>> >> >> > that?) would be a significant cost/problem, I think.
>> >> >> >
>> >> >> > Though I like the tidiness of -1 everywhere, that backwards
>> >> >> > compatibility with debug_ranges (& debug_loc similarly) is a problem.
>> >> >> > Though ld.bfd does special case debug_ranges (& should special case
>> >> >> > debug_loc), perhaps that's the solution. -2 for debug_ranges and
>> >> >> > debug_loc, -1 everywhere else (which effectively means everywhere in
>> >> >> > DWARFv5 onwards)?
>> >> >>
>> >> >> Exactly.  Base it on the section name, .debug_loc and .debug_ranges
>> >> >> use -2 and all other .debug_* use -1.  No explicit version check needed.
>> >> >>
>> >> >> In terms of *specification*, DWARF v6 would say to use -1, and the
>> >> >> best-practices on the wiki would say "use -1, except for .debug_loc
>> >> >> and .debug_ranges use -2."
>> >> >
>> >> >Sounds pretty good to me.
>> >>
>> >> Looks good to me, too.
>> >>
>> >> >
>> >> >Ray - how do you feel about that? Do you think that's something lld
>> >> >would be able to do?
>> >>
>> >> This seems fine. If my understanding is correct, for an R_X86_64_64
>> >> referencing sym + addend, the relocated value is:
>> >>
>> >> if is_defined(sym)
>> >>    return addr(sym) + addend
>> >> if relocated_section is .debug_ranges or .debug_loc
>> >>    return -2 + addend
>> >>
>> >> // Every DWARF v5 section falls here
>> >> return -1 + addend
>> >>
>> >>
>> >> I still want addend to take part in the computation. This makes
>> >> subtraction sound and provides a bit more information (the original
>> >> length).
>> >>
>> >> For Alexey's example, if we see [0xfffffffffffffffe, 0x0000000000000004)
>> >> in .debug_ranges, we know that the original length was 6.  DWARF
>> >> consumers should allow [0xfffffffffffffffe, *) (but reject other
>> >> (low > high) pairs) and emit no diagnostic (DWARFAddressRange::valid() needs an update).
>> >
>> >Ah, unfortunately, I don't think it would be OK to leave the addend. I
>> >believe it needs to be handled sort of the way bfd ld does it (but -2
>> >or -1, instead of 0 or 1) - dropping the addend.
>> >
>> >Otherwise we'd be at risk of having both the start and end of the
>> >function having non-zero addend (if they were in a comdat group with
>> >some other code/another function, for instance) and then both would
>> >wraparound and be indistinguishable from normal address ranges.
>> >
>> >eg:
>> >
>> >__attribute__((section(".text.x"))) void f1() { }
>> >__attribute__((section(".text.x"))) void f2() { }
>> >int main() { }
>> >$ clang++ rng.cpp -fuse-ld=lld -Wl,-gc-sections -g && llvm-dwarfdump a.out
>> >DW_TAG_compile_unit
>> >   DW_AT_ranges      (0x00000000
>> >    [0x0000000000000000, 0x0000000000000016)
>> >    [0x0000000000400540, 0x0000000000400548))
>> >...
>> >  DW_TAG_subprogram
>> >    DW_AT_low_pc    (0x0000000000000000)
>> >    DW_AT_high_pc   (0x0000000000000006)
>> >    DW_AT_name      ("f1")
>> >...
>> >  DW_TAG_subprogram
>> >    DW_AT_low_pc    (0x0000000000000010)
>> >    DW_AT_high_pc   (0x0000000000000016)
>> >    DW_AT_name      ("f2")
>> >...
>> >
>> >Linking this with gold or lld, leaves the low_pc of 'f2' non-zero (10,
>> >in my case). Because the addend is non-zero. That makes it not
>> >possible to identify as dead code - even if using -2, since the addend
>> >would still wrap it back around to a positive value.
>> >
>> >Not the /most/ realistic example - there are cases where clang puts
>> >multiple functions in the same comdat, though I think they still go in
>> >separate sections still - though maybe they don't actually need to be
>> >in separate sections, though, since they're comdat'd together.
>> >
>> >I don't think it's meaningful to talk about the length of a function
>> >that doesn't exist/there are no instructions for - so I don't think
>> >there's a loss in fidelity to not have that information in the final
>> >linked DWARF.
>>
>> Thanks for the (explicit section) example. There is an R_X86_64_64
>> referencing .text.x+16 relocating .debug_ranges (begining address
>> offset) and another relocating .debug_info (DW_AT_low_pc)
>
>Yeah, it's specifically the one in DW_AT_low_pc (+10) that's the
>problem. The one in .debug_ranges is the end of the range that starts
>at .text.x+0, so that particular pair (+0 to +16) would be OK with
>-1,+16 - but the low_pc of +10 would be ambiguous/unidentifiable as
>dead code.
>
>> It looks like we have to ignore the addend. Amended formula for a
>> relocated value in a non-SHF_ALLOC section:
>>
>>    if is_defined(sym)
>>       return addr(sym) + addend
>>    if relocated_section is .debug_ranges or .debug_loc
>>       return -2
>>
>>    // Every DWARF v5 section falls here
>>    return -1  # Question: why can't we use 0? [1]
>
>Yep, that'd be the theory (assuming changing the value for
>non-.debug_* sections from 0+addend to -1 is acceptable).
>
>> >> David> I guess we'd need to probably have a conversation with the DWARF
>> >> David> Committee and/or with debugger vendors (gdb and lldb) to ensure they'd
>> >> David> be willing to make matching changes to support this...
>> >> David>
>> >> David> and since they might not support it out of the gate, perhaps we'd need
>> >> David> it behind a flag for the current (albeit buggy) backwards
>> >> David> compatibility? Or maybe it works well enough without explicit support
>> >> David> already.
>> >>
>> >> Alexey> So after implementing this, some tools could potentially stop working.
>> >> Alexey> I do not know, such tools. So, I am not sure whether that is the
>> >> Alexey> problem.
>> >>
>> >> I hope we don't need a linker option (selecting -1+addend or 0+addend).
>> >> If unfortunately it can't be avoided, we probably need to discuss it
>> >> with binutils.  I can do that.
>> >>
>> >> Another thought.  I wonder whether non-debug non-SHF_ALLOC sections
>> >> expect addr(undefined) to be 0. If they can live with any value, and if
>> >> we don't need the -2 special case (to avoid collision with the base
>> >> address selection entry -1),
>> >
>> >I think we will need that special case due to DWARFv4 for a while yet.
>> >Though, yes - if the current addr(undefined) variance (0 for bfd,
>> >0+addend for lld/gold) extends beyond .debug_* sections, /maybe/ we
>> >could make it configurable with a default, while keeping the hardcoded
>> >exception for .debug_ranges and .debug_loc of using -2. Though I'd
>> >worry a bit about more open-ended use cases for the non-debug
>> >sections. At least for the debug sections we have a pretty good idea
>> >of which consumers to go and talk to/test with, etc.
>>
>> [1] I forgot to ask why we can't resolve addr(unsigned)+addend to 0 in
>> .debug_* (excluding .debug_ranges and .debug_loc).
>
>The initial reason was to avoid terminatinating .debug_loc lists and
>.debug_range lists early - but that doesn't apply since we are talking
>about carving out -2 specially for them anyway.
>
>I believe the other possible reason is for a system that uses 0 as a
>valid executable address.
>
>> Paul> As has been mentioned elsewhere, Sony generally fixes up references from
>> Paul> debug info to stripped functions (of any length) using -1, because
>> Paul> that’s a less-likely-to-be-real address than 0x0 or 0x1.  (0x0 is a
>> Paul> typical base address for shared libraries, I’d think using it has the
>> Paul> potential to mislead various consumers.)  For .debug_ranges we use -2,
>> Paul> because both a 0/0 pair and a -1/-1 pair have a reserved meaning in that
>> Paul> section.
>>
>> 0 in an ET_DYN object refers to the base address, where the ELF header
>> (or headers of other binary formats) resides. The ELF header is unlikely
>> to be a meaningful code sequence. In LLD and GNU ld -z separate-code's
>> layout, it is mapped to a (non-executable) PF_R PT_LOAD segment.
>>
>> I suspect special cased 0 values may have been baked into some DWARF
>> consumers.
>>
>> Do other binary formats/platforms allow the base address to be a
>> meaningful place that we can't lose DWARF for? (If 0 is a meaningful
>> place but its debuggability does not matter that much, we can still use
>> 0 as a special value, losing the debuggability of address 0).
>
>Yeah, that's the thing - while it might not be an issue for ELF, DWARF
>would want a standard that's fairly resilient to quirky/interesting
>use cases (admittedly - such platforms could equally want to make
>their executable code way up in the address space near max or max - 1,
>etc?).
>
>I know Alexey was particularly interested in the problem of
>low-address functions overlapping with the [0, length) ranges created
>by dead code with gold/lld - but I'm not sure if he has a use case for
>literal zero, or just "low but non-zero". Hopefully he can weigh in
>here.
>
>> >> we can add a more generic option
>> >>
>> >>    -z undef-address-in-nonalloc=-1
>> >>
>> >> Applying to every relocation referencing an undefined symbol in a
>> >> non-SHF_ALLOC section. Let -1 be the default value to make .debug_*
>> >> (excluding .debug_ranges and .debug_loc) happy.
>> >
>> >If we end up blessing it as part of the DWARF spec, we probably
>> >wouldn't want it to be user-configurable for the .debug_ sections, so
>> >I'd hesitate to add that configurability to the linker lest we have to
>> >revoke it to conform to DWARF (breaking flag compatibility with
>> >previous versions of the linker, etc). Admittedly we'll be breaking
>> >output compatibility with this change regardless, so potentially
>> >having the flag as an escape hatch could be useful.
>>
>> For these open-ended non-debug non-SHF_ALLOC sections, some use section
>> groups/SHF_LINK_ORDER to solve the problem. I can't think of an example
>> of the rest. The flag is probably not going to be useful.
>>
>> We need to ask binutils whether they can think of some examples the
>> output compatibility matters.
>
>*nod* wouldn't hurt to know/get some feedback on.

Sent 
https://sourceware.org/pipermail/binutils/2020-May/111358.html
cross posted to gdb at sourceware.org and elfutils-devel at sourceware.org
(I came across https://sourceware.org/elfutils/Debuginfod.html recently
it seems that elfutils have a bunch of DWARF-aware programs,
so sent it as well. In case we need a linker option, I hope GNU ld, gold
and LLD can be compatible)

(I never see posts cross posting to llvm-dev & binutils, so I did not do
that :) )

I can CC folks there when needed, assuming people who have participated
in the discussion here will not object to be CCed :)

>>
>> >
>> >> Link a program twice, with -z undef-sym=-1 and -z undef-sym=-2,
>> >> respectively. A binary diff can reveal relocation entries referencing
>> >> undefined symbols.
>> >>
>> >> >
>> >> >> --paulr
>> >> >>
>> >> >> >
>> >> >> > >
>> >> >> > >
>> >> >> > > >
>> >> >> > > > The relocated code can be allocatable or not. Non-allocatable non-
>> >> >> > debug
>> >> >> > > > code can have meaningful pointer subtraction as well. This is why I am
>> >> >> > > > not too fond of (using a fixed value everywhere).
>> >> >> > > >
>> >> >> > > > >(also, I had a silly idea, but what would happen if we added a CU
>> >> >> > > > attribute
>> >> >> > > > >with an address value that was a reference to a weak always-unused
>> >> >> > > > symbol,
>> >> >> > > > >that way the linker would fix it up with whatever its preferred magic
>> >> >> > > > value
>> >> >> > > > >was, and the consumer would then know what the magic value was that
>> >> >> > > > >represented dead code? (though this would only work if the value were
>> >> >> > > > used
>> >> >> > > > >consistently everywhere - which is zero for gold/lld (well, almost...
>> >> >> > you
>> >> >> > > > >can still create situations where a non-zero value is used even for a
>> >> >> > > > >low_pc), but wouldn't work for binutils ld (1 in debug_ranges, 0
>> >> >> > > > elsewhere)
>> >> >> > > > >or Sony (-2 in debug_ranges, -1 elsewhere)... - so, wouldn't actually
>> >> >> > > > work
>> >> >> > > > >for any producer currently, so maybe there's little value in that as
>> >> >> > a
>> >> >> > > > >feature))
>> >> >> > > >
>> >> >> > > > For a non-SHF_ALLOC section, LLD currently considers it a GC root if
>> >> >> > all
>> >> >> > > > the conditions below are satisfied:
>> >> >> > > >
>> >> >> > > > * not SHT_REL[A]
>> >> >> > > > * not SHF_LINK_ORDER
>> >> >> > > > * not in a section group
>> >> >> > > >
>> >> >> > > > (I managed to lobby the ideas to GNU ld. GNU ld from binutils 2.35
>> >> >> > > > onwards will have mostly compatible semantics with LLD)
>> >> >> > > >
>> >> >> > > > There is a cost fragmenting a .debug_* section: sizeof(Elf64_Shdr)=64
>> >> >> > ->
>> >> >> > > > each section takes 64 bytes in the section header table.
>> >> >> >
>> >> >> > Soryr, I missed a step here - you're talking about the cost to
>> >> >> > fragmenting .debug_* sections as an alternative to choosing a special
>> >> >> > address value to resolve for dead code? (by removing the DWARF that
>> >> >> > refers to the dead code, uinstead of keeping it and having to write a
>> >> >> > special address value into it?)
>> >> >> >
>> >> >> > Unfortunately, no matter the cost - that solution doesn't apply to
>> >> >> > Split DWARF. Maybe at some point we'll want to have some output from
>> >> >> > the linker that lists the dead/live code, and use that for building a
>> >> >> > dwp (like dsymutil) but I don't think we can predicate correctness on
>> >> >> > such a thing - in part because we'd still want to be able to read the
>> >> >> > .dwo files without post-processing for more interactive/iterative
>> >> >> > development scenarios. So we'd still need a special address value to
>> >> >> > write into debug_addr when using Split DWARF, and I think it's
>> >> >> > important to allow the non-split case to look like the split case
>> >> >> > where it doesn't /have/ to diverge - if divergence provides benefits,
>> >> >> > that's nice, but I don't think it'd be good to make that divergence
>> >> >> > /necessary/.
>> >> >> >
>> >> >> > > > SHF_LINK_ORDER
>> >> >> > > > has semantics of a lightweight section group. Assume we don't want to
>> >> >> > > > have one .debug_* for each function section, this .debug_* will be a
>> >> >> > GC
>> >> >> > > > root. Relocations from it (even if the symbol is weak) will retain the
>> >> >> > > > sections defining the symbols.
>> >> >> > >
>> >> >> > > We did some quick research into per-function .debug_info fragments a
>> >> >> > > while back, putting the subprogram info into the same section group as
>> >> >> > > the function; it was not an unqualified win.  The very large number of
>> >> >> > > sections costs processing time, and cross-section references added to
>> >> >> > > the relocation count (I believe these can generally be resolved by MC
>> >> >> > > in a non-fragmented .debug_info section).
>> >> >> >
>> >> >> > Yeah, you either pay more relocations (or size cost to use signatures
>> >> >> > instead) and/or more size to duplicate some DIEs (like type units
>> >> >> > currently duplicate fundamental/non-user-defined types into the type
>> >> >> > unit).
>> >> >> >
>> >> >> > (& it's a non-starter for Split DWARF anyway)
>> >> >> >
>> >> >> > >  James Henderson might have
>> >> >> > > the actual results stashed somewhere.
>> >> >> > >
>> >> >> > > That approach *might* still be faster than post-processing a unified
>> >> >> > > section, which IIUC is what D59553 does.
>> >> >> > >
>> >> >> > > >
>> >> >> > > > So, this trick can't work without refining the --gc-sections rules
>> >> >> > > > further.
>> >> >> > >
>> >> >> > > If I understand the objection, yeah, we can't have .debug_* sections
>> >> >> > > being gc roots.
>> >> >> >
>> >> >> > I'm not sure I follow, here - "this trick" being "splitting debug info
>> >> >> > into droppable chunks for each function/subprogram and putting those
>> >> >> > chunks in the same comdat group as the function code itself" - that
>> >> >> > trick wouldn't work because the current rules would move that debug
>> >> >> > info from (where it currently is in one big debug_info section) a
>> >> >> > non-gc root to (where it would go - becoming part of a comdat group) a
>> >> >> > gc-root? OK, right. Agreed that's another reason (apart from the Split
>> >> >> > DWARF one, and the size/reloc tradeoff ones) that would be
>> >> >> > problematic.
>> >> >> >
>> >> >> > - Dave
>> >> >> >
>> >> >> > >
>> >> >> > > --paulr
>> >> >> > >
>> >> >> > > >
>> >> >> > > > >
>> >> >> > > > >>  If you’re looking only at zero-length functions, you can stop
>> >> >> > there;
>> >> >> > > > but
>> >> >> > > > >> I’m not sure why stopping there solves much of a real problem, as
>> >> >> > > > >> zero-length functions seem like a weird corner case.
>> >> >> > > > >>
>> >> >> > > > >
>> >> >> > > > >They're the case that breaks existing usage by terminating the range
>> >> >> > list
>> >> >> > > > >early - the other existing usage seems to be fine with "resolve to
>> >> >> > > > addend"
>> >> >> > > > >strategy that lld and gold use - in that it moves most
>> >> >> > dead/deduplicated
>> >> >> > > > >functions outside the executable range and so consumers never come
>> >> >> > asking
>> >> >> > > > >for "what code is at instruction 5" because they're never executing
>> >> >> > code
>> >> >> > > > at
>> >> >> > > > >a pc of 5. But, yes, this existing solution doesn't work once you
>> >> >> > have
>> >> >> > > > code
>> >> >> > > > >mapped into low address spaces or have utterly massive functions that
>> >> >> > > > might
>> >> >> > > > >have a length that would reach into the executable address space even
>> >> >> > > > when
>> >> >> > > > >their start is remapped to zero.
>> >> >> > > >
>> >> >> > > > For posterity, David gave me an example offline: void f1() { } void
>> >> >> > f2() {
>> >> >> > > > } int main() { f1(); }
>> >> >> > > >
>> >> >> > > > clang -fuse-ld=bfd -ffunction-sections -Wl,--gc-sections -g a.c -o
>> >> >> > a.bfd
>> >> >> > > > llvm-dwarfdump -debug-ranges a.bfd
>> >> >> > > > =>
>> >> >> > > > R_X86_64_64 relocations in .debug_ranges are resolved to 1, ignoring
>> >> >> > the
>> >> >> > > > addend
>> >> >> > > >
>> >> >> > > > (Behavior introduced in
>> >> >> > > > https://urldefense.com/v3/__https://sourceware.org/git/?p=binutils-
>> >> >> > __;!!JmoZiZGBv3RvKRSx!tvZHOB67bddZqG6U17Cj0X1rfKGW72rrjmKKFQNBxBlQvgdukM5e
>> >> >> > 3DkTDxuwRVwsyw$
>> >> >> > > >
>> >> >> > gdb.git;a=blobdiff;f=bfd*ChangeLog;h=8fbaed21fa2c8238459acb637545583f3cfbb
>> >> >> > > >
>> >> >> > fdf;hp=18a3a67be3a5980998c4461b5a739e54f3551b17;hb=e4067dbb2a3368dbf908b39
>> >> >> > > > c5435c84d51abc9f3;hpb=c0621d88b096cc046adf6ed484baea9ba5bfe721)
>> >> >> > > >
>> >> >> > > > The comments below are also insightful. I need to ponder more (and
>> >> >> > need
>> >> >> > > > to read the DWARF v4 and v5 specs more as I am not so familiar these
>> >> >> > > > DWARF constructs). But it is too late now. Will probably comment
>> >> >> > > > another day :)
>> >> >> > > >
>> >> >> > > > >
>> >> >> > > > >> Linkers know how to strip dead functions (gc) or deduplicate them
>> >> >> > (icf,
>> >> >> > > > >> COMDAT) and people do this all the time, in some cases (COMDAT)
>> >> >> > without
>> >> >> > > > >> explicitly asking for it, so non-zero-length functions seem like
>> >> >> > the
>> >> >> > > > much
>> >> >> > > > >> more interesting case.  In that situation, -1 (or -2) seems like a
>> >> >> > much
>> >> >> > > > >> wiser choice of blessed-as-not-real address, versus 0x0 or 0x1.
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> Stripping non-zero-length functions does mean you have to care
>> >> >> > about
>> >> >> > > > more
>> >> >> > > > >> sections.  For example .debug_locs would want to be fixed up the
>> >> >> > same
>> >> >> > > > way
>> >> >> > > > >> as .debug_ranges, not because a debugger would care but so that
>> >> >> > dumpers
>> >> >> > > > >> would not run into the 0/0 brick wall.
>> >> >> > > > >>
>> >> >> > > > >
>> >> >> > > > >Yep - in theory a consumer could actually use a loclist across
>> >> >> > multiple
>> >> >> > > > >sections (if a global variable got hoisted into a register for a
>> >> >> > function
>> >> >> > > > >for instance), but I don't know of any producers doing this today -
>> >> >> > until
>> >> >> > > > >then, yeah, it's just a dumping problem and ld.bfd does produce DWARF
>> >> >> > > > that
>> >> >> > > > >has that problem (because it resolves both relocations to dead code
>> >> >> > > > >(begin/end of a range) to zero in all sections except debug_ranges,
>> >> >> > so
>> >> >> > > > >terminates the loclist list early) - binutils objdump avoids dumping
>> >> >> > the
>> >> >> > > > >following corrupted fragment by only dumping hunks of debug_loc
>> >> >> > starting
>> >> >> > > > at
>> >> >> > > > >places referenced from debug_info. Without debug_info it won't dump
>> >> >> > > > >anything from debug_loc - and if the references from debug_info,
>> >> >> > parsed
>> >> >> > > > >until the 0,0 terminator don't cover the whole debug_loc section, it
>> >> >> > > > prints
>> >> >> > > > >messages saying there are "gaps".
>> >> >> > > > >
>> >> >> > > > >Agreed that you'd want debug_loc to have the same special handling as
>> >> >> > > > >debug_ranges if it has special handling. Though ideally we'd pick a
>> >> >> > value
>> >> >> > > > >that works equally everywhere? (-2, by the sounds of it)
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >> We also fix up lengths in .debug_aranges to zero, although there
>> >> >> > might
>> >> >> > > > be
>> >> >> > > > >> history behind that tactic that I’m not aware of; it seems like it
>> >> >> > > > ought to
>> >> >> > > > >> be unnecessary, if consumers are aware of the special address(es).
>> >> >> > > > >>
>> >> >> > > > >
>> >> >> > > > >Yeah, no idea about debug_aranges... I'd have thought it'd be fine
>> >> >> > with
>> >> >> > > > the
>> >> >> > > > >same approach as debug_ranges, but I haven't looked at debug_aranges
>> >> >> > in a
>> >> >> > > > >long time.
>> >> >> > > > >
>> >> >> > > > >I guess the only remaining question is: Since it's possible to have
>> >> >> > code
>> >> >> > > > on
>> >> >> > > > >some systems down at address zero, or close enough to it that [0,
>> >> >> > length)
>> >> >> > > > >might overlap with real exxecutable code addresses - does anyone know
>> >> >> > of
>> >> >> > > > >the inverse: where code is mapped up near uint32 max? Such that that
>> >> >> > > > usage
>> >> >> > > > >wouldn't be able to sacrifice uint32 max - 1 to use as a blessed
>> >> >> > value
>> >> >> > > > here?
>> >> >> > > > >
>> >> >> > > > >- Dave
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> --paulr
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> *From:* Alexey Lapshin <alapshin at accesssoftek.com>
>> >> >> > > > >> *Sent:* Thursday, May 28, 2020 9:03 AM
>> >> >> > > > >> *To:* Sriraman Tallam <tmsriram at google.com>; Wei Mi
>> >> >> > <wmi at google.com>;
>> >> >> > > > >> Robinson, Paul <paul.robinson at sony.com>; Adrian Prantl
>> >> >> > > > <aprantl at apple.com>;
>> >> >> > > > >> Jonas Devlieghere <jdevlieghere at apple.com>; Alexey Lapshin <
>> >> >> > > > >> a.v.lapshin at mail.ru>; Eric Christopher <echristo at gmail.com>;
>> >> >> > Fangrui
>> >> >> > > > Song
>> >> >> > > > >> <maskray at google.com>; David Blaikie <dblaikie at gmail.com>;
>> >> >> > > > >> llvm-dev at lists.llvm.org
>> >> >> > > > >> *Subject:* Re: [llvm-dev] Range lists, zero-length functions,
>> >> >> > linker gc
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> Hi David,
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> >So there have been several recent discussions about the issues
>> >> >> > around
>> >> >> > > > >>
>> >> >> > > > >> >DWARF-agnostic linking and gc-sections, linkonce function
>> >> >> > definitions
>> >> >> > > > >> being
>> >> >> > > > >>
>> >> >> > > > >> >dropped, etc - and just how much DWARF-awareness would be suitable
>> >> >> > > > >>
>> >> >> > > > >> >in a linker to help with this situation.
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> > I'd like to discuss a narrower instance of this issue: Zero
>> >> >> > length
>> >> >> > > > >> gc'd/deduplicated functions.
>> >> >> > > > >>
>> >> >> > > > >> > LLVM seems to at least produce zero length functions in a few
>> >> >> > cases:
>> >> >> > > > >> > * non-void function without a return statement
>> >> >> > > > >> > * function definition containing only llvm_unreachable
>> >> >> > > > >> > (both of these trap at -O0, but at higher optimization levels
>> >> >> > even
>> >> >> > > > the
>> >> >> > > > >> trap
>> >> >> > > > >>
>> >> >> > > > >> > instruction is removed & you get the full power UB of control
>> >> >> > > > >> flowing off
>> >> >> > > > >>
>> >> >> > > > >> > the end of the function into whatever other bytes are after that
>> >> >> > > > >> function)
>> >> >> > > > >>
>> >> >> > > > >> > So, for context, debug_ranges (this whole issue doesn't exist in
>> >> >> > > > >> DWARFv5,
>> >> >> > > > >>
>> >> >> > > > >> > FWIW) is a list of address pairs, terminated by a pair of zeros.
>> >> >> > > > >>
>> >> >> > > > >> > With function sections, or even just with normal C++ inline
>> >> >> > > > functions,
>> >> >> > > > >>
>> >> >> > > > >> > the CU will have a range entry for that function that consists of
>> >> >> > two
>> >> >> > > > >> relocations
>> >> >> > > > >>
>> >> >> > > > >> > - to the start and end of the function. Generally the start of
>> >> >> > the
>> >> >> > > > >> function is the
>> >> >> > > > >>
>> >> >> > > > >> > start of the section, and the end is "start of function + length
>> >> >> > of
>> >> >> > > > >> function (aka addend)".
>> >> >> > > > >>
>> >> >> > > > >> >  Usually any relocation to the section would keep that section
>> >> >> > > > "alive"
>> >> >> > > > >> during linking -
>> >> >> > > > >>
>> >> >> > > > >> > but that would cause debug info to defeat linker GC and
>> >> >> > > > deduplication.
>> >> >> > > > >> So there's
>> >> >> > > > >>
>> >> >> > > > >> > special rules for how linkers handle these relocations in debug
>> >> >> > info
>> >> >> > > > to
>> >> >> > > > >> allow the
>> >> >> > > > >>
>> >> >> > > > >> > sections to be dropped - what do you write in the bytes that
>> >> >> > > > requested
>> >> >> > > > >> the relocation?
>> >> >> > > > >>
>> >> >> > > > >> > Binutils ld: Special cases only debug_ranges, resolving all
>> >> >> > > > relocations
>> >> >> > > > >> to dead
>> >> >> > > > >>
>> >> >> > > > >> > code to 1. In other debug sections, these values are all resolved
>> >> >> > to
>> >> >> > > > >> zero.
>> >> >> > > > >>
>> >> >> > > > >> > Gold and lld: Special cases all debug info sections - resolving
>> >> >> > all
>> >> >> > > > >> relocations
>> >> >> > > > >>
>> >> >> > > > >> > to "addend" (so begin usually goes to zero, end goes to "size of
>> >> >> > > > >> function")
>> >> >> > > > >>
>> >> >> > > > >> > These special rules are designed to ensure
>> >> >> > omitted/gc'd/deduplicated
>> >> >> > > > >> functions
>> >> >> > > > >>
>> >> >> > > > >> > don't cause the range list to terminate prematurely (which would
>> >> >> > > > happen
>> >> >> > > > >> if begin/end
>> >> >> > > > >>
>> >> >> > > > >> > were both resolved to zero).
>> >> >> > > > >>
>> >> >> > > > >> >But with an empty function, gold and lld's strategy here fails to
>> >> >> > > > avoid
>> >> >> > > > >> terminating a
>> >> >> > > > >>
>> >> >> > > > >> >range list by accident.
>> >> >> > > > >>
>> >> >> > > > >> > What should we do about it?
>> >> >> > > > >>
>> >> >> > > > >> >  1) Ensure no zero-length functions exist? (doesn't address
>> >> >> > backwards
>> >> >> > > > >>
>> >> >> > > > >> > compatibility/existing functions/other compilers)
>> >> >> > > > >> > 2) adopt the binutils approach to this (at least in debug_ranges
>> >> >> > -
>> >> >> > > > maybe
>> >> >> > > > >> in all
>> >> >> > > > >>
>> >> >> > > > >> > debug sections? (doing it in other sections could break )
>> >> >> > > > >> >  3) Revisit the discussion about using an even more 'blessed'
>> >> >> > value,
>> >> >> > > > >>
>> >> >> > > > >> > like int max-1? (
>> >> >> > > >
>> >> >> > https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3
>> >> >> > > > RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-
>> >> >> > 1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$
>> >> >> > > > >>
>> >> >> > > >
>> >> >> > <https://urldefense.com/v3/__https:/reviews.llvm.org/D59553__;!!JmoZiZGBv3
>> >> >> > > >
>> >> >> > RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY63HSqgMOg$
>> >> >> > > > >
>> >> >> > > > >>  )
>> >> >> > > > >>
>> >> >> > > > >> >  (I don't have links to all the recent threads about this
>> >> >> > discussion
>> >> >> > > > - I
>> >> >> > > > >> think D59553
>> >> >> > > > >>
>> >> >> > > > >> > might've spawned a separate broader discussion/non-review - oh,
>> >> >> > > > Alexey
>> >> >> > > > >> wrote a
>> >> >> > > > >>
>> >> >> > > > >> > good summary with links to other discussions here:
>> >> >> > > > >>
>> >> >> > > > >> >
>> >> >> > https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm-
>> >> >> > > > dev/2019-
>> >> >> > September/135068.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-
>> >> >> > > > QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiey_aMV0lQ$
>> >> >> > > > >> <https://urldefense.com/v3/__http:/lists.llvm.org/pipermail/llvm-
>> >> >> > > > dev/2019-
>> >> >> > > >
>> >> >> > September/135068.html__;!!JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE
>> >> >> > > > 70_57b4_rsj1TN0qB8NpBvVKtY638NIRu2g$>
>> >> >> > > > >>  )
>> >> >> > > > >>
>> >> >> > > > >> > Thoughts?
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> I think for the problem of "zero length functions and
>> >> >> > .debug_ranges"
>> >> >> > > > >> binutils approach looks good:
>> >> >> > > > >>
>> >> >> > > > >> >Special cases only debug_ranges, resolving all relocations to
>> >> >> > > > >> >dead code to 1. In other debug sections, these values are all
>> >> >> > resolved
>> >> >> > > > to
>> >> >> > > > >> >zero.
>> >> >> > > > >>
>> >> >> > > > >> But, this would not completely solve the problem from
>> >> >> > > > >>
>> >> >> > > >
>> >> >> > https://urldefense.com/v3/__https://reviews.llvm.org/D59553__;!!JmoZiZGBv3
>> >> >> > > > RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-
>> >> >> > 1Zdb4OydT2xVhpDTRyUixtaozLYiewZqMLtoA$
>> >> >> > > > >>
>> >> >> > > >
>> >> >> > <https://urldefense.com/v3/__https:/reviews.llvm.org/D59553__;!!JmoZiZGBv3
>> >> >> > > >
>> >> >> > RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY63HSqgMOg$
>> >> >> > > > >
>> >> >> > > > >> - Overlapped address ranges. Binutils approach will solve the
>> >> >> > problem
>> >> >> > > > if
>> >> >> > > > >> the address range specified as start_address:end_address. While
>> >> >> > > > resolving
>> >> >> > > > >> relocations, it would replace such a range with 1:1.
>> >> >> > > > >> However, It would not work if address ranges were specified as
>> >> >> > > > >> start_address:length since the length is not relocated. This case
>> >> >> > could
>> >> >> > > > be
>> >> >> > > > >> additionally fixed by fast scan debug_info for High_PC defined as
>> >> >> > > > length
>> >> >> > > > >> and changing it to 1. Something which you suggested here:
>> >> >> > > > >> https://urldefense.com/v3/__http://lists.llvm.org/pipermail/llvm-
>> >> >> > > > dev/2020-May/141599.html__;!!JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-
>> >> >> > > > QbL7LtpIL-1Zdb4OydT2xVhpDTRyUixtaozLYiexb8NU_Fw$
>> >> >> > > > >> <https://urldefense.com/v3/__http:/lists.llvm.org/pipermail/llvm-
>> >> >> > > > dev/2020-
>> >> >> > > >
>> >> >> > May/141599.html__;!!JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b
>> >> >> > > > 4_rsj1TN0qB8NpBvVKtY63PsubKJQ$>
>> >> >> > > > >> .
>> >> >> > > > >>
>> >> >> > > > >> So it looks like following solution could fix both problems and be
>> >> >> > > > >> relatively fast:
>> >> >> > > > >>
>> >> >> > > > >> "Resolve all relocations from debug sections into dead code to 1.
>> >> >> > Parse
>> >> >> > > > >> debug sections and replace HighPc of an address range pointing to
>> >> >> > dead
>> >> >> > > > code
>> >> >> > > > >> and specified as length to 1".
>> >> >> > > > >>
>> >> >> > > > >> As the result all address ranges pointing into dead code would be
>> >> >> > > > marked
>> >> >> > > > >> as zero length.
>> >> >> > > > >>
>> >> >> > > > >> There still exist another problem:
>> >> >> > > > >>
>> >> >> > > > >> DWARF4: "A range list entry (but not a base address selection or
>> >> >> > end of
>> >> >> > > > >> list entry) whose beginning and
>> >> >> > > > >> ending addresses are equal has no effect because the size of the
>> >> >> > range
>> >> >> > > > >> covered by such an
>> >> >> > > > >> entry is zero."
>> >> >> > > > >>
>> >> >> > > > >> DWARF5: "A bounded range entry whose beginning and ending address
>> >> >> > > > offsets
>> >> >> > > > >> are equal
>> >> >> > > > >> (including zero) indicates an empty range and may be ignored."
>> >> >> > > > >>
>> >> >> > > > >> These rules allow us to ignore zero-length address ranges. I.e.,
>> >> >> > some
>> >> >> > > > tool
>> >> >> > > > >> reading DWARF is permitted to ignore related DWARF entries. In that
>> >> >> > > > case,
>> >> >> > > > >> there could be ignored essential descriptions. That problem could
>> >> >> > > > happen
>> >> >> > > > >> with -flto=thin example
>> >> >> > > >
>> >> >> > https://urldefense.com/v3/__https://reviews.llvm.org/D54747*1503720__;Iw!!
>> >> >> > > > JmoZiZGBv3RvKRSx!sRjL4Vdx9oC8TPFhKZ-QbL7LtpIL-
>> >> >> > > > 1Zdb4OydT2xVhpDTRyUixtaozLYiezSujGHwQ$
>> >> >> > > > >>
>> >> >> > > >
>> >> >> > <https://urldefense.com/v3/__https:/reviews.llvm.org/D54747*1503720__;Iw!!
>> >> >> > > >
>> >> >> > JmoZiZGBv3RvKRSx!r2Jqc2yEgxrb2QcQEocDHJBizj0KUKE70_57b4_rsj1TN0qB8NpBvVKtY
>> >> >> > > > 637ju_eQw$>
>> >> >> > > > >> . In this example, all type definitions except one were replaced
>> >> >> > with
>> >> >> > > > >> declarations by thinlto. The definition, which was left, is in a
>> >> >> > piece
>> >> >> > > > of
>> >> >> > > > >> debug info related to deleted code. According to zero-length rule,
>> >> >> > that
>> >> >> > > > >> definition could be ignored, and finally, incomplete debug info
>> >> >> > could
>> >> >> > > > be
>> >> >> > > > >> used.
>> >> >> > > > >>
>> >> >> > > > >> So, it probably should be forbidden to generate debug_info, which
>> >> >> > could
>> >> >> > > > >> become incomplete after removing pieces related to zero length
>> >> >> > address
>> >> >> > > > >> ranges. Otherwise, creating zero-length address ranges could lead
>> >> >> > to
>> >> >> > > > >> incomplete debug info.
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >> Thank you, Alexey.
>> >> >> > > > >>
>> >> >> > > > >>
>> >> >> > > > >>


More information about the llvm-dev mailing list