[lldb-dev] Inconsistencies in CIE pointer in FDEs in .debug_frame

Martin Storsjö via lldb-dev lldb-dev at lists.llvm.org
Mon Nov 25 01:46:05 PST 2019


On Mon, 25 Nov 2019, Pavel Labath wrote:

> On 24/11/2019 23:16, Martin Storsjö via lldb-dev wrote:
>> Hi,
>> 
>> I'm looking into something that seems like an inconsistency in handling of 
>> the CIE pointer in FDEs in .debug_frame, between how debug info is 
>> generated in LLVM and consumed in LLDB.
>> 
>> For FDEs in .eh_frame, the CIE pointer/cie_id field is interpreted as an 
>> offset from the current FDE - this seems to be consistent.
>> 
>> But for cases in .debug_frame, they are treated differently. In LLDB, the 
>> cie_id field is assumed to be relative to the begin of the .debug_frame 
>> section: 
>> https://github.com/llvm/llvm-project/blob/master/lldb/source/Symbol/DWARFCallFrameInfo.cpp#L482-L495 
>> 
>> However, when this field is produced in LLVM, it can, depending on 
>> MCAsmInfo flags, end up written as a plain absolute address to the CIE: 
>> https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCDwarf.cpp#L1699-L1705 
>> 
>> That code in MCDwarf.cpp hasn't been touched in many years, so I would 
>> expect that the info it generates actually has been used since and been 
>> found to be correct. Or are most cases built with -funwind-tables or
>> similar enabled by default, so that this path is only exercised in
>> untested cases?
>> 
>> In the case where I'm running into this, LLDB reports "error: Invalid cie
>> offset" when running executables with such .debug_frame sections.
>> 
>> By adding an ", true" to the end of the EmitSymbolValue call in 
>> MCDwarf.cpp, the symbol reference is made section relative and the code 
>> seems to do what LLDB expects. Is that correct, or should LLDB learn the 
>> cases (which?) where the cie_id is an absolute address instead of a section 
>> relative one?
>> 
>> // Martin
>
> What's the target you're encountering this behavior on? Can you maybe provide
> a short example of what the CIE/FDE entries in question look like?

I'm seeing this behaviour for mingw targets. GCC produces debug_frame 
sections where the CIE pointer is a section relative address (with a 
SECTREL relocation), while LLVM produces debug_frame sections with 
absolute (global/virtual) addresses.

LLDB seems to expect the format that GCC produces here.

> I could be wrong (I'm not really an expert on this), but my understanding is 
> that "asmInfo->doesDwarfUseRelocationsAcrossSections()" is basically 
> equivalent to "is target MachO"

Yes, that's pretty much my take on it as well. The BPF target also has an
option for setting this flag in asminfo, but other than that, it's not
modified.

> That said, if that is all there is here, then it does not seem to me like 
> there's any special support in lldb needed, as the cie offset will always be 
> a correct absolute offset from the start of the section by the time lldb gets 
> to see it (and so it shouldn't matter if the offset was put there by the 
> compiler or the linker). This makes me think that I am missing something, but 
> I have no idea what could that be..

This wasn't the inconsistency I'm looking into.

I'm looking into an inconsistency between section relative and absolute
addresses. The default case in MCDwarf.cpp calls
EmitSymbolValue(&cieStart, 4).

By default EmitSymbolValue emits _absolute_ addresses (or more precisely,
relocations that make the linker produce absolute addresses), i.e. the
full address of the CIE, instead of section relative.

The EmitSymbolValue function, declared at 
https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/MC/MCStreamer.h#L669-L670, 
takes an IsSectionRelative parameter, which defaults to false here (as it 
isn't specified). I would expect that it should be true, as LLDB expects a 
section relative address here.
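
To make the distinction concrete, here is a minimal Python sketch of the
value that ends up in a 4-byte symbol reference in each mode. It is an
illustration only, not LLVM or linker code; the function name and its
parameters are hypothetical. The section address 0x00404000 is the value
seen in the mingw dump further down, and the CIE is assumed to sit at
offset 0 of .debug_frame.

```python
def symbol_field_value(section_address: int, offset_in_section: int,
                       section_relative: bool) -> int:
    """Value written into a 4-byte field that references a symbol.

    Hypothetical helper for illustration; real linkers derive this from
    the relocation type recorded in the object file.
    """
    if section_relative:
        # Section relative: distance from the start of the containing section.
        return offset_in_section & 0xFFFFFFFF
    # Absolute: the symbol's full virtual address.
    return (section_address + offset_in_section) & 0xFFFFFFFF


# Assumed layout: .debug_frame mapped at 0x00404000, CIE at offset 0.
DEBUG_FRAME_ADDRESS = 0x00404000
CIE_OFFSET = 0

absolute = symbol_field_value(DEBUG_FRAME_ADDRESS, CIE_OFFSET, section_relative=False)
relative = symbol_field_value(DEBUG_FRAME_ADDRESS, CIE_OFFSET, section_relative=True)
print(f"absolute:         {absolute:#010x}")
print(f"section relative: {relative:#010x}")
```

The two values only coincide when the section address is zero, which is
relevant for the ELF case discussed below.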

I think this is a bug in LLVM's MCDwarf.cpp, but it puzzles me how it can 
have gone unnoticed.

But now I tested this a bit more with ELF setups, and realized that it 
somehow does seem to do the right thing. It might have something to do 
with how ELF linkers handle this kind of section that isn't loaded at 
runtime (and thus perhaps doesn't really have a virtual address assigned).

So that pretty much clears up the question regarding the inconsistency, and
raises more questions about how this really works in ELF and MCDwarf.


A test procedure that shows off the issue is this:

$ cat test.c
void entry(void) { }

$ bin/clang -fno-unwind-tables test.c -c -g -o test.o -target i686-linux-gnu
$ bin/llvm-objdump -r test.o

test.o: file format ELF32-i386

<redacted>

RELOCATION RECORDS FOR [.debug_frame]:
00000018 R_386_32 .debug_frame
0000001c R_386_32 .text

# As far as I know, both of these R_386_32 relocations indicate that the
# full, absolute address of the referenced symbol should be written at
# the relocated location.

$ bin/ld.lld test.o -o exe -e entry
$ bin/llvm-dwarfdump --eh-frame exe

exe:    file format ELF32-i386

.debug_frame contents:

00000000 00000010 ffffffff CIE
<redacted for brevity>

00000014 00000018 00000000 FDE cie=00000000 pc=004010c0...004010c5
                   ^
# The CIE offset, the third field, is set to zero (the offset where the
# CIE starts, even though the relocation indicated an absolute address),
# but the R_386_32 for the .text address gave a correct absolute pc range.
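
One plausible explanation for the ELF result, sketched in Python below:
R_386_32 is computed as S + A (symbol address plus addend), and if the
linker assigns address 0 to a section that is not loaded at runtime, the
"absolute" address of a symbol in .debug_frame is numerically the same
as its section offset. That address-0 behaviour is an assumption here,
not something verified against the linker source. The .text address is
taken from the pc= value in the dump above, assuming entry() is at the
very start of .text.

```python
def apply_r_386_32(symbol_address: int, addend: int) -> int:
    """R_386_32 resolves to S + A, truncated to 32 bits."""
    return (symbol_address + addend) & 0xFFFFFFFF


# Assumption: a section that is not loaded at runtime gets address 0.
DEBUG_FRAME_ADDRESS = 0x00000000
# From the dump above (pc=004010c0), assuming entry() is at .text + 0.
TEXT_ADDRESS = 0x004010C0

# Relocation at 0x18: reference to the CIE at .debug_frame + 0.
cie_pointer = apply_r_386_32(DEBUG_FRAME_ADDRESS, 0)
# Relocation at 0x1c: reference to entry() at .text + 0.
initial_location = apply_r_386_32(TEXT_ADDRESS, 0)

print(f"cie pointer:      {cie_pointer:#010x}")
print(f"initial location: {initial_location:#010x}")
```

Under that assumption the same relocation type yields a valid section
offset for .debug_frame and a real virtual address for .text, which
matches the dump.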



Now if I repeat the same steps but for a mingw target, the result is
different:

$ bin/clang -fno-unwind-tables test.c -c -g -o test.o -target i686-mingw32
$ bin/llvm-objdump -r test.o

test.o: file format COFF-i386

<redacted>

RELOCATION RECORDS FOR [.debug_frame]:
00000018 IMAGE_REL_I386_DIR32 .debug_frame
0000001c IMAGE_REL_I386_DIR32 .text

# Same thing here, absolute addresses for .debug_frame and .text

$ bin/lld-link test.o -out:exe -entry:entry -subsystem:console -debug:dwarf
$ bin/llvm-dwarfdump --eh-frame exe
exe:    file format COFF-i386

.debug_frame contents:

00000000 00000010 ffffffff CIE
<redacted>

00000014 00000014 00404000 FDE cie=00404000 pc=00401000...00401005
                   ^
# Here the CIE offset, the third column, ended up as an absolute address,
# 0x00404000, which LLDB rejects.
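
The two dumps can be compared with a small Python sketch that walks a
32-bit DWARF .debug_frame section and checks whether each FDE's CIE
pointer, read as a section offset, lands on a CIE. This is not LLDB's
actual implementation; the function names are hypothetical, 64-bit DWARF
is not handled, and the entry bodies are zero padded toy data rather
than real call frame instructions.

```python
import struct

CIE_ID_DEBUG_FRAME = 0xFFFFFFFF  # marks a CIE in 32-bit DWARF .debug_frame


def walk_debug_frame(data: bytes):
    """Yield (offset, length, kind, cie_pointer) for each entry.

    cie_pointer is None for CIEs. Assumes little-endian 32-bit DWARF.
    """
    offset = 0
    while offset + 8 <= len(data):
        length, cie_id = struct.unpack_from("<II", data, offset)
        if length == 0:
            break  # zero length: treat as terminator or padding
        if cie_id == CIE_ID_DEBUG_FRAME:
            yield offset, length, "CIE", None
        else:
            yield offset, length, "FDE", cie_id
        offset += 4 + length  # length does not include the length field


def check_cie_pointers(data: bytes):
    """Return (fde_offset, cie_pointer, ok) for every FDE in the section."""
    entries = list(walk_debug_frame(data))
    cie_offsets = {off for off, _, kind, _ in entries if kind == "CIE"}
    return [(off, ptr, ptr in cie_offsets)
            for off, _, kind, ptr in entries if kind == "FDE"]


def make_section(fde_length: int, cie_pointer: int, pc: int) -> bytes:
    """Build a toy section: one 0x10-byte CIE followed by one FDE."""
    cie = struct.pack("<II", 0x10, CIE_ID_DEBUG_FRAME) + bytes(0x10 - 4)
    fde = struct.pack("<IIII", fde_length, cie_pointer, pc, 5)
    fde += bytes(fde_length - 12)  # zero padding in place of CFI opcodes
    return cie + fde


# Header values copied from the two llvm-dwarfdump outputs above.
elf_section = make_section(0x18, 0x00000000, 0x004010C0)
mingw_section = make_section(0x14, 0x00404000, 0x00401000)

for name, section in (("ELF", elf_section), ("mingw", mingw_section)):
    for fde_offset, cie_pointer, ok in check_cie_pointers(section):
        status = "ok" if ok else "invalid CIE offset"
        print(f"{name}: FDE at {fde_offset:#x}, cie={cie_pointer:#010x}: {status}")
```

With the ELF values the CIE pointer 0 resolves to the CIE at the start of
the section; with the mingw values 0x00404000 is far beyond the end of
the section, so a consumer that expects a section offset has nothing to
resolve it to.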




So, if I make the call to EmitSymbolValue() set the IsSectionRelative 
parameter to true, I get the correct, expected relocations for this 
section:

RELOCATION RECORDS FOR [.debug_frame]:
00000018 IMAGE_REL_I386_SECREL .debug_frame
0000001c IMAGE_REL_I386_DIR32 .text

This matches what GCC produces in similar cases as well.

But with this in place, ELF targets misbehave severely; there's no 
relocation produced at all for the .debug_frame symbol, and the second 
relocation gets written at the wrong offset.

In any case, it's clearly only an LLVM/MC issue, and no issue with LLDB.

// Martin

