[llvm-dev] Range lists, zero-length functions, linker gc

Eric Christopher via llvm-dev llvm-dev at lists.llvm.org
Thu May 28 17:39:23 PDT 2020


(FWIW I'm following this, but also haven't disagreed with anything Dave has
said yet :)

On Thu, May 28, 2020 at 3:55 PM David Blaikie <dblaikie at gmail.com> wrote:

> On Thu, May 28, 2020 at 6:03 AM Alexey Lapshin <alapshin at accesssoftek.com>
> wrote:
>
>> Hi David,
>>
>>
>> >So there have been several recent discussions about the issues around
>>
>> >DWARF-agnostic linking and gc-sections, linkonce function definitions
>> being
>>
>> >dropped, etc - and just how much DWARF-awareness would be suitable
>>
>> >in a linker to help with this situation.
>>
>> > I'd like to discuss a narrower instance of this issue: Zero length
>> gc'd/deduplicated functions.
>>
>> > LLVM seems to at least produce zero length functions in a few cases:
>> > * non-void function without a return statement
>> > * function definition containing only llvm_unreachable
>> > (both of these trap at -O0, but at higher optimization levels even the
>> trap
>> > instruction is removed & you get the full power UB of control
>> flowing off
>> > the end of the function into whatever other bytes are after that
>> function)
>>
>> > So, for context, debug_ranges (this whole issue doesn't exist in
>> DWARFv5,
>> > FWIW) is a list of address pairs, terminated by a pair of zeros.
>> > With function sections, or even just with normal C++ inline functions,
>> > the CU will have a range entry for that function that consists of two
>> relocations
>> > - to the start and end of the function. Generally the start of the
>> function is the
>> > start of the section, and the end is "start of function + length of
>> function (aka addend)".
>>
>> >  Usually any relocation to the section would keep that section "alive"
>> during linking -
>> > but that would cause debug info to defeat linker GC and deduplication.
>> So there's
>> > special rules for how linkers handle these relocations in debug info to
>> allow the
>> > sections to be dropped - what do you write in the bytes that requested
>> the relocation?
>>
>> > Binutils ld: Special cases only debug_ranges, resolving all relocations
>> to dead
>> > code to 1. In other debug sections, these values are all resolved to
>> zero.
>> > Gold and lld: Special cases all debug info sections - resolving all
>> relocations
>> > to "addend" (so begin usually goes to zero, end goes to "size of
>> function")
>>
>> > These special rules are designed to ensure omitted/gc'd/deduplicated
>> functions
>> > don't cause the range list to terminate prematurely (which would happen
>> if begin/end
>> > were both resolved to zero).
>>
>> >But with an empty function, gold and lld's strategy here fails to avoid
>> terminating a
>> >range list by accident.
>>
>> > What should we do about it?
>>
>> >  1) Ensure no zero-length functions exist? (doesn't address backwards
>> > compatibility/existing functions/other compilers)
>> > 2) adopt the binutils approach to this (at least in debug_ranges -
>> maybe in all
>> > debug sections? (doing it in other sections could break )
>> >  3) Revisit the discussion about using an even more 'blessed' value,
>> > like int max-1? ( https://reviews.llvm.org/D59553 )
>>
>> >  (I don't have links to all the recent threads about this discussion -
>> I think D59553
>> > might've spawned a separate broader discussion/non-review - oh, Alexey
>> wrote a
>> > good summary with links to other discussions here:
>> >  http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html )
>>
>> > Thoughts?
>>
>> I think for the problem of "zero length functions and .debug_ranges"
>> binutils approach looks good:
>>
>> >Special cases only debug_ranges, resolving all relocations to
>> >dead code to 1. In other debug sections, these values are all resolved to
>> >zero.
>>
>> But, this would not completely solve the problem from
>> https://reviews.llvm.org/D59553 - Overlapped address ranges. Binutils
>> approach will solve the problem if the address range specified as
>> start_address:end_address. While resolving relocations, it would replace
>> such a range with 1:1.
>> However, It would not work if address ranges were specified as
>> start_address:length since the length is not relocated.
>>
> This case could be additionally fixed by fast scan debug_info for High_PC
>> defined as length and changing it to 1. Something which you suggested here:
>> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
>>
>
> Hmm, I don't /think/ I intended to suggest anything that would have to
> parse all the debug_info, even if just to fixup high_pc. I meant that
> debug_rnglist for the CU at least (rnglist has fewer problems - you can't
> accidentally terminate it early, but still has the "large functions in
> programs that use relatively low code addresses can't just be resolved to
> "addend" because then [0, length) of the large function might overlap into
> that code address range") could be modified by a DWARF-aware linker to
> remove the unused chunks. The DWARF that describes a specific function
> using low_pc/high_pc - it may be split into a .dwo file and unreachable by
> the linker - so it /needs/ a magic value for the address referenced by the
> low_pc to indicate that it is invalid.
>
> Which all comes back to "we probably need to pick a value that's
> explicitly invalid" and -2 (max - 1) seems to be about the right thing.
>
>
>>
>> So it looks like following solution could fix both problems and be
>> relatively fast:
>>
>> "Resolve all relocations from debug sections into dead code to 1. Parse
>> debug sections and replace HighPc of an address range pointing to dead code
>> and specified as length to 1".
>>
>
> That second part seems pretty expensive compared to anything else the
> linker is doing with debug info. I'd try to avoid it if at all possible.
>
>
>> As the result all address ranges pointing into dead code would be marked
>> as zero length.
>>
>> There still exist another problem:
>>
>> DWARF4: "A range list entry (but not a base address selection or end of
>> list entry) whose beginning and
>> ending addresses are equal has no effect because the size of the range
>> covered by such an
>> entry is zero."
>>
>> DWARF5: "A bounded range entry whose beginning and ending address offsets
>> are equal
>> (including zero) indicates an empty range and may be ignored."
>>
>> These rules allow us to ignore zero-length address ranges. I.e., some
>> tool reading DWARF is permitted to ignore related DWARF entries.
>>
>
> I agree it allows consumers to ignore that entry in the range list because
> that entry is zero-length/equivalent to not being present at all - I don't
> think that means consumers can ignore the DIE that refers to this range
> list. I think it's valid DWARF to have a CU that only describes types,
> without any code attached to it at all. Or for a subprogram that's been
> eliminated to still be used by a consumer for name lookup purposes - so the
> consumer can tell the user there is a function called "f1" and tell the
> user what parameter types, return type it has, etc - not ignore it entirely.
>
>
>> In that case, there could be ignored essential descriptions. That problem
>> could happen with -flto=thin example
>> https://reviews.llvm.org/D54747#1503720 . In this example, all type
>> definitions except one were replaced with declarations by thinlto. The
>> definition, which was left, is in a piece of debug info related to deleted
>> code. According to zero-length rule, that definition could be ignored, and
>> finally, incomplete debug info could be used.
>>
>
> Yeah, I think the bug there is the linker dropping object files just
> because they have no exxecutable code in them - I think the patch that did
> that was reverted, if I'm remembering correctly.
>
>
>>
>> So, it probably should be forbidden to generate debug_info, which could
>> become incomplete after removing pieces related to zero length address
>> ranges. Otherwise, creating zero-length address ranges could lead to
>> incomplete debug info.
>>
>> Thank you, Alexey.
>>
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200528/6e94e509/attachment.html>


More information about the llvm-dev mailing list