[lldb-dev] Inconsistencies in CIE pointer in FDEs in .debug_frame
Pavel Labath via lldb-dev
lldb-dev at lists.llvm.org
Mon Nov 25 02:11:23 PST 2019
On 25/11/2019 10:46, Martin Storsjö wrote:
> On Mon, 25 Nov 2019, Pavel Labath wrote:
>
>> On 24/11/2019 23:16, Martin Storsjö via lldb-dev wrote:
>>> Hi,
>>>
>>> I'm looking into something that seems like an inconsistency in
>>> handling of the CIE pointer in FDEs in .debug_frame, between how
>>> debug info is generated in LLVM and consumed in LLDB.
>>>
>>> For FDEs in .eh_frame, the CIE pointer/cie_id field is interpreted as
>>> an offset from the current FDE - this seems to be consistent.
>>>
>>> But for cases in .debug_frame, they are treated differently. In LLDB,
>>> the cie_id field is assumed to be relative to the beginning of the
>>> .debug_frame section:
>>> https://github.com/llvm/llvm-project/blob/master/lldb/source/Symbol/DWARFCallFrameInfo.cpp#L482-L495
>>>
>>> However, when this field is produced in LLVM, it can, depending on
>>> MCAsmInfo flags, end up written as a plain absolute address to the
>>> CIE:
>>> https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCDwarf.cpp#L1699-L1705
>>>
>>> That code in MCDwarf.cpp hasn't been touched in many years, so I
>>> would expect that the info it generates actually has been used since
>>> and been found to be correct. Or are most cases built with
>>> -funwind-tables or similar enabled by default, so that this path is
>>> only exercised in untested cases?
>>>
>>> In the case where I'm running into this, LLDB reports "error: Invalid
>>> cie offset" when running executables with such .debug_frame sections.
>>>
>>> By adding an ", true" to the end of the EmitSymbolValue call in
>>> MCDwarf.cpp, the symbol reference is made section relative and the
>>> code seems to do what LLDB expects. Is that correct, or should LLDB
>>> learn the cases (which?) where the cie_id is an absolute address
>>> instead of a section relative one?
>>>
>>> // Martin
>>
>> What's the target you're encountering this behavior on? Can you maybe
>> provide a short example of what the CIE/FDE entries in question look like?
>
> I'm seeing this behaviour for mingw targets. GCC produces debug_frame
> sections where the CIE pointer is a section relative address (with a
> SECTREL relocation), while LLVM produces debug_frame sections with
> absolute (global/virtual) addresses.
Right. That's the part I was missing. Thanks.
>
> LLDB seems to expect the format that GCC produces here.
>
>> I could be wrong (I'm not really an expert on this), but my
>> understanding is that
>> "asmInfo->doesDwarfUseRelocationsAcrossSections()" is basically
>> equivalent to "is target MachO"
>
> Yes, that's pretty much my take of it as well. The BPF target also has
> an option for setting this flag in asminfo, but other than that, it's
> not modified.
>
>> That said, if that is all there is here, then it does not seem to me
>> like there's any special support in lldb needed, as the cie offset
>> will always be a correct absolute offset from the start of the section
>> by the time lldb gets to see it (and so it shouldn't matter if the
>> offset was put there by the compiler or the linker). This makes me
>> think that I am missing something, but I have no idea what that could
>> be.
>
> This wasn't the inconsistency I'm looking into.
>
> I'm looking into an inconsistency between section relative and absolute
> addresses. The default case in MCDwarf.cpp calls
> EmitSymbolValue(&cieStart, 4).
>
> By default EmitSymbolValue emits _absolute_ addresses (or more
> precisely, relocations that make the linker produce absolute
> addresses), i.e. the full address of the CIE, instead of section relative.
>
> The EmitSymbolValue function, declared at
> https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/MC/MCStreamer.h#L669-L670,
> takes an IsSectionRelative parameter, which defaults to false here (as
> it isn't specified). I would expect that it should be true, as LLDB
> expects a section relative address here.
>
> I think this is a bug in LLVM's MCDwarf.cpp, but it puzzles me how it
> can have gone unnoticed.
>
> But now I tested this a bit more with ELF setups, and realized that it
> somehow does seem to do the right thing. It might have something to do
> with how ELF linkers handle this kind of section that isn't loaded at
> runtime (and thus perhaps doesn't really have a virtual address assigned).
>
> So that pretty much clears the question regarding inconsistency, and
> raises more questions about how this really works in ELF and MCDwarf.
>
>
> A test procedure that shows off the issue is this:
>
> $ cat test.c
> void entry(void) { }
>
> $ bin/clang -fno-unwind-tables test.c -c -g -o test.o -target i686-linux-gnu
> $ bin/llvm-objdump -r test.o
>
> test.o: file format ELF32-i386
>
> <redacted>
>
> RELOCATION RECORDS FOR [.debug_frame]:
> 00000018 R_386_32 .debug_frame
> 0000001c R_386_32 .text
>
> # As far as I know, these two R_386_32 relocations both indicate that the
> # full, absolute address of the referenced symbol should be inserted at
> # each of these two locations.
>
> $ bin/ld.lld test.o -o exe -e entry
> $ bin/llvm-dwarfdump --eh-frame exe
>
> exe: file format ELF32-i386
>
> .debug_frame contents:
>
> 00000000 00000010 ffffffff CIE
> <redacted for brevity>
>
> 00000014 00000018 00000000 FDE cie=00000000 pc=004010c0...004010c5
>                   ^
> # The CIE offset, the third field, is set as zero (the offset where the
> # CIE starts, even though the relocation indicated absolute address),
> # but the R_386_32 for the .text address gave a correct absolute pc range.
>
>
>
> Now if I repeat the same steps but for a mingw target, this ends up
> different:
>
> $ bin/clang -fno-unwind-tables test.c -c -g -o test.o -target i686-mingw32
> $ bin/llvm-objdump -r test.o
>
> test.o: file format COFF-i386
>
> <redacted>
>
> RELOCATION RECORDS FOR [.debug_frame]:
> 00000018 IMAGE_REL_I386_DIR32 .debug_frame
> 0000001c IMAGE_REL_I386_DIR32 .text
>
> # Same thing here, absolute addresses for .debug_frame and .text
>
> $ bin/lld-link test.o -out:exe -entry:entry -subsystem:console -debug:dwarf
> $ bin/llvm-dwarfdump --eh-frame exe
> exe: file format COFF-i386
>
> .debug_frame contents:
>
> 00000000 00000010 ffffffff CIE
> <redacted>
>
> 00000014 00000014 00404000 FDE cie=00404000 pc=00401000...00401005
>                   ^
> # Here the CIE offset, the third column, ended up as an absolute address,
> # 0x00404000, which LLDB rejects.
>
>
>
>
> So, if I make the call to EmitSymbolValue() set the IsSectionRelative
> parameter to true, I get the correct, expected relocations for this
> section:
>
> RELOCATION RECORDS FOR [.debug_frame]:
> 00000018 IMAGE_REL_I386_SECREL .debug_frame
> 0000001c IMAGE_REL_I386_DIR32 .text
>
> This matches what GCC produces in similar cases as well.
>
> But with this in place, ELF targets misbehave severely; there's no
> relocation produced at all for the .debug_frame symbol, and the second
> relocation gets written at the wrong offset.
>
> In any case, it's clearly only an LLVM/MC issue, and no issue with LLDB.
>
Thanks for the detailed explanation.
So, what ELF linkers do is that they link non-loadable (non-SHF_ALLOC)
sections as if they were loaded at address zero. I think it's possible
to change that via a linker script, but I think doing that would cause
pretty much everything to blow up.
This means that the whole absolute vs. section-relative inconsistency is
irrelevant there (and I would expect the ELF folks would not even
consider that an inconsistency/bug).
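
To make the arithmetic concrete, here is a minimal sketch of why the two
relocation styles coincide on ELF but not on COFF. The function name is
hypothetical and this is not LLVM or linker code; the base addresses are
the ones implied by the two dumps quoted above (0 for the ELF case,
0x00404000 for the mingw case).

```python
# Illustrative sketch only: resolve_cie_field is a hypothetical helper,
# not an LLVM, LLD, or LLDB API.

def resolve_cie_field(section_base: int, cie_offset: int, section_relative: bool) -> int:
    """Value a linker would write into the FDE's CIE pointer field."""
    if section_relative:
        # Section-relative relocation: only the offset inside .debug_frame.
        return cie_offset
    # Absolute relocation: address assigned to the section plus the offset.
    return section_base + cie_offset

ELF_DEBUG_FRAME_BASE = 0x0           # non-loaded section, linked as if at address 0
COFF_DEBUG_FRAME_BASE = 0x00404000   # address seen in the mingw dump above

# The CIE sits at offset 0 of .debug_frame in both dumps.
elf_abs = resolve_cie_field(ELF_DEBUG_FRAME_BASE, 0, section_relative=False)
coff_abs = resolve_cie_field(COFF_DEBUG_FRAME_BASE, 0, section_relative=False)
coff_rel = resolve_cie_field(COFF_DEBUG_FRAME_BASE, 0, section_relative=True)

print(hex(elf_abs), hex(coff_abs), hex(coff_rel))
```

With a base of zero the absolute and section-relative results are the
same number, which is why the problem only shows up once the section
gets a nonzero address, as it does on COFF.
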
In any case, I agree with your assessment that this is an llvm/mc bug,
and so we'll probably need to open this issue on llvm-dev. I guess the
reason that this wasn't discovered is because llvm tools (and lldb in
particular) are not so widely used/tested on Windows. It might be
interesting to see what happens if you feed the llvm generated file to
gdb, or maybe link it with the gnu linker...
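
For checking such a file by hand, something along these lines could
help. It is only a rough sketch: it assumes little-endian 32-bit DWARF,
the helper names are made up for illustration, and the bounds check is a
simplified stand-in for whatever lldb actually does before it reports
"Invalid cie offset". The synthetic sections mirror the entry sizes and
CIE pointer values from the two dumps quoted above, with all other bytes
zeroed.

```python
import struct

CIE_ID_32 = 0xFFFFFFFF  # cie_id value that marks a CIE in 32-bit .debug_frame

def check_entries(section: bytes):
    """Return (offset, kind, cie_pointer, ok) for each entry in a .debug_frame blob.

    Assumes little-endian, 32-bit DWARF (no 64-bit length escape handling).
    """
    results = []
    offset = 0
    while offset + 8 <= len(section):
        length, cie_id = struct.unpack_from("<II", section, offset)
        if length == 0:
            break  # zero length: terminator or padding
        if cie_id == CIE_ID_32:
            results.append((offset, "CIE", None, True))
        else:
            # Simplified check (an assumption): a section-relative CIE
            # pointer must at least fall inside the section.
            ok = cie_id < len(section)
            results.append((offset, "FDE", cie_id, ok))
        offset += 4 + length  # the length field does not count itself
    return results

def make_section(fde_length: int, cie_pointer: int) -> bytes:
    """Build a zero-filled CIE (length 0x10) followed by one FDE."""
    cie = struct.pack("<II", 0x10, CIE_ID_32) + bytes(0x10 - 4)
    fde = struct.pack("<II", fde_length, cie_pointer) + bytes(fde_length - 4)
    return cie + fde

elf_like = check_entries(make_section(0x18, 0x00000000))
coff_like = check_entries(make_section(0x14, 0x00404000))

for name, entries in (("elf", elf_like), ("coff", coff_like)):
    for off, kind, cie, ok in entries:
        print(name, hex(off), kind, "-" if cie is None else hex(cie), "ok" if ok else "INVALID")
```

The ELF-shaped input passes because the pointer is 0, while the
COFF-shaped input is flagged because 0x00404000 is far beyond the end
of the section.
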
pl