[llvm-dev] Range lists, zero-length functions, linker gc

Alexey Lapshin via llvm-dev llvm-dev at lists.llvm.org
Fri May 29 06:31:11 PDT 2020


But this would not completely solve the problem from https://reviews.llvm.org/D59553 (overlapped address ranges). The binutils approach solves the problem if the address range is specified as start_address:end_address: while resolving relocations, it would replace such a range with 1:1.

>> However, it would not work if address ranges were specified as start_address:length, since the
>> length is not relocated.
>> This case could additionally be fixed by a fast scan of debug_info for High_PC defined as a length
>> and changing it to 1. Something like what you suggested
>> here: http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
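
To make the difference between the two encodings concrete, here is a minimal sketch. It assumes a simplified model in which every relocation into discarded code is resolved to the constant 1, as described above; the function names and sample addresses are hypothetical and not taken from any real linker.

```python
DEAD_VALUE = 1  # assumed value written for relocations into discarded code


def resolve_start_end(start, end, is_dead):
    """Range encoded as start_address:end_address; both fields are relocated."""
    if is_dead:
        return (DEAD_VALUE, DEAD_VALUE)
    return (start, end)


def resolve_start_length(start, length, is_dead):
    """Range encoded as start_address:length; only the start is relocated."""
    low = DEAD_VALUE if is_dead else start
    # The length is a plain constant, so it survives unchanged.
    return (low, low + length)


# Hypothetical discarded function at 0x1000, 0x40 bytes long.
print(resolve_start_end(0x1000, 0x1040, is_dead=True))    # (1, 1): empty range
print(resolve_start_length(0x1000, 0x40, is_dead=True))   # (1, 65): still non-empty
```

In the second case the range [1, 65) is still non-empty and can overlap live code placed at low addresses, which is why the length would have to be patched separately.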

> Hmm, I don't /think/ I intended to suggest anything that would have to parse all the debug_info,
> even if just to fixup high_pc. I meant that debug_rnglist for the CU at least (rnglist has fewer
> problems - you can't accidentally terminate it early, but still has the "large functions in programs
> that use relatively low code addresses can't just be resolved to "addend" because then [0, length)
> of the large function might overlap into that code address range") could be modified by a
> DWARF-aware linker to remove the unused chunks.

Right, you did not.
That was my suggestion, extending your idea: fix not only debug_rnglist
but all other occurrences of HighPC as well.

>The DWARF that describes a specific function
> using low_pc/high_pc - it may be split into a .dwo file and unreachable by the linker - so it /needs/ a
> magic value for the address referenced by the low_pc to indicate that it is invalid.

For split DWARF, a solution that updates HighPC would have to patch the .dwo files as well.

> Which all comes back to "we probably need to pick a value that's explicitly invalid" and -2 (max - 1)
> seems to be about the right thing.

>> So it looks like the following solution could fix both problems and be relatively fast:
>> "Resolve all relocations from debug sections into dead code to 1. Parse debug sections
>> and replace HighPc of an address range pointing to dead code and specified as length with 1".

> That second part seems pretty expensive compared to anything else the linker is doing with
> debug info. I'd try to avoid it if at all possible.

Agreed. There are some concerns about -2, though, which may or may not turn out to be significant:

I do not know of any real problems caused by using UINT64_MAX-1 for address ranges
pointing to deleted code. Moreover, while testing https://reviews.llvm.org/D59553
I noticed that the tools work better with it: lldb, llvm-symbolizer, GNU addr2line,
and GNU objdump report code locations correctly with the patch and incorrectly
without it.

But there is a corner case: an address range specified as start_address:length.
After replacing start_address with -2, LowPC becomes higher than HighPC.
From the point of view of the DWARF standard, this is "undefined behavior": the standard
says nothing about that case, so different tools could interpret it differently.
Some tools could assume that such a situation is impossible and crash if it occurs.
Some could ignore it. Others could report an error and stop working.
For example, llvm-dwarfdump --verify reports an error and continues to work.

llvm-dwarfdump --verify :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
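
The reported range follows from 64-bit wraparound. The sketch below reproduces it; the length of 6 is an assumed value chosen so that the result matches the output above, and the helper name is hypothetical.

```python
UINT64_MAX = 2**64 - 1
TOMBSTONE = UINT64_MAX - 1  # the "-2" value discussed in this thread


def range_from_length(low_pc, length):
    """Return (low, high) for a high_pc encoded as a length, with 64-bit wraparound."""
    high_pc = (low_pc + length) & UINT64_MAX
    return low_pc, high_pc


# Length 6 is an assumption, not a value stated in the original report.
low, high = range_from_length(TOMBSTONE, 6)
print(f"[0x{low:016x}, 0x{high:016x})")  # [0xfffffffffffffffe, 0x0000000000000004)
print(low > high)                        # True
```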

So after implementing this, some tools could potentially stop working.
I do not know of any such tools, so I am not sure whether this is a real problem.

Additionally, it is necessary to document that behavior in the DWARF standard to avoid problems
in the future (the same as for zero-length address ranges):

"A bounded range entry whose beginning address offset is greater than its ending address offset
indicates an invalid range and may be ignored."

Note that this does not specify an additional magic value (UINT64_MAX-1).
Instead, it describes the general situation (LowPC > HighPC).
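
On the consumer side, the proposed rule amounts to a simple filter. This is only a sketch of how a tool might apply it, not how any existing tool behaves; the function name and the sample ranges are hypothetical.

```python
def usable_ranges(ranges):
    """Keep only ranges with low < high.

    Drops zero-length ranges (low == high) as well as ranges whose
    start is greater than their end (low > high).
    """
    return [(low, high) for low, high in ranges if low < high]


ranges = [
    (0x1000, 0x1040),             # ordinary live function
    (0xfffffffffffffffe, 0x4),    # deleted code: LowPC > HighPC
    (0x2000, 0x2000),             # zero-length range
]
print([(hex(lo), hex(hi)) for lo, hi in usable_ranges(ranges)])
```

Only the first range survives. The check never compares against a specific magic value, so it works no matter which address the linker writes, as long as LowPC ends up above HighPC.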

If backward compatibility is not a problem, then using LowPC > HighPC to indicate an invalid
address range pointing to deleted code seems to be the fastest solution (it could be
implemented by resolving relocations from debug sections into deleted code to UINT64_MAX-1).

If backward compatibility is a problem, then we could use the already standardized
"zero-length address range" to mark address ranges pointing to deleted code.
That solution would require patching the address range length in the DWARF.

Thank you, Alexey