[llvm-bugs] [Bug 43290] New: [DWARF] Padding between location lists confuses list-reader

via llvm-bugs llvm-bugs at lists.llvm.org
Thu Sep 12 05:00:13 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=43290

            Bug ID: 43290
           Summary: [DWARF] Padding between location lists confuses
                    list-reader
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Keywords: wrong-debug
          Severity: normal
          Priority: P
         Component: DebugInfo
          Assignee: unassignedbugs at nondot.org
          Reporter: jeremy.morse.llvm at gmail.com
                CC: dblaikie at gmail.com, jdevlieghere at apple.com,
                    keith.walker at arm.com, llvm-bugs at lists.llvm.org,
                    paul_robinson at playstation.sony.com

It seems that LLVM's location-list reader chokes when there's padding between
lists, which is what GNU LD produces by default when it deletes a COMDAT
function. This situation can be replicated with the test files at the bottom
of this ticket on a fresh Ubuntu 19.04 VM, building them with:

  gcc test1.cpp -o test1.o -g -O1 -fno-inline -c -gdwarf-4 -gstrict-dwarf
  gcc test2.cpp -o test2.o -g -O1 -fno-inline -c -gdwarf-4 -gstrict-dwarf
  gcc test3.cpp -o test3.o -g -O1 -fno-inline -c -gdwarf-4 -gstrict-dwarf
  gcc test1.o test2.o test3.o -o a.out -gstrict-dwarf

The code in the test files is meaningless, but crafted to:
 * generate some location lists,
 * which are in a COMDAT / Weak function, specifically "a_method",
 * that get de-duplicated at link time,
 * and have more location lists after a_method's (test3.cpp).

When linking this, GNU LD (the default on Linux) appears to keep the
location lists from the duplicate function in the .debug_loc section and
merely nulls out their addresses. Running `readelf --debug-dump=loc a.out` and
trimming the start and end of the output gives:

--------8<--------
    00000087 00000000000011dd 00000000000011f0 (DW_OP_reg5 (rdi))
    0000009a <End of list>
    000000aa <End of list>
readelf: Warning: There is a hole [0xba - 0xe8] in .debug_loc section.
    000000e8 00000000000011f7 000000000000120a (DW_OP_reg5 (rdi))
-------->8--------

The "hole" seems to be a problem for llvm-dwarfdump. I'm using LLVM 8 on the
VM, but this also reproduces with trunk. If I run
`llvm-dwarfdump-8 -debug-loc a.out`, I first get an error message:

    error: location list overflows the debug_loc section.
    error: failed to consume entire .debug_loc section

And then after the gap some nonsense location lists:

--------8<--------
0x00000087: 
            [0x00000000000011dd,  0x00000000000011f0): DW_OP_reg5 RDI

0x000000aa: 

0x000000ba: 
            [0x0000000000540001,  0x0000000000000000): 
            [0x1414330074000900,  0x000000009f1c1e1b): 

0x000000ee: 
            [0x0000000000130000,  0x0000135500010000): 
            [0x000000001c000000,  0x001c510001000000): 
            [0x0000002000000000,  0x2053000100000000): 
            [0x0000210000000000,  0x5000010000000000): 
-------->8--------
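The desynchronisation can be reproduced in miniature. The sketch below is not
LLVM's reader: LocEntry, LocList, parseOne, parseAll and sampleSection are
hypothetical names, and the byte layout of the sample section is my assumption
about what a "nulled" list looks like (addresses zeroed, length and expression
bytes left in place). It relies only on the DWARF v4 .debug_loc encoding for a
64-bit little-endian target: each entry is a begin address, an end address, a
2-byte expression length and the expression; a (0, 0) pair ends a list; a begin
address of all ones is a base address selection entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; these are not LLVM's data structures.
struct LocEntry {
  uint64_t begin = 0, end = 0;
  std::vector<uint8_t> expr;
};
struct LocList {
  size_t offset = 0;
  std::vector<LocEntry> entries;
};

// Read a little-endian integer of `width` bytes, advancing `pos` on success.
static bool readLE(const std::vector<uint8_t> &buf, size_t &pos, size_t width,
                   uint64_t &out) {
  assert(width <= 8);
  if (pos > buf.size() || buf.size() - pos < width)
    return false;
  out = 0;
  for (size_t i = 0; i < width; ++i)
    out |= static_cast<uint64_t>(buf[pos + i]) << (8 * i);
  pos += width;
  return true;
}

// Parse the single list that starts at `pos`. Returns false if it overflows.
static bool parseOne(const std::vector<uint8_t> &buf, size_t &pos,
                     LocList &list) {
  list.offset = pos;
  for (;;) {
    uint64_t begin = 0, end = 0, len = 0;
    if (!readLE(buf, pos, 8, begin) || !readLE(buf, pos, 8, end))
      return false;
    if (begin == 0 && end == 0)
      return true; // end-of-list marker
    if (begin == UINT64_MAX)
      continue; // base address selection entry carries no expression
    if (!readLE(buf, pos, 2, len) || buf.size() - pos < len)
      return false;
    LocEntry e;
    e.begin = begin;
    e.end = end;
    e.expr.assign(buf.data() + pos, buf.data() + pos + len);
    list.entries.push_back(e);
    pos += len;
  }
}

// Pre-read every list, assuming each one starts where the previous one ended.
static std::vector<LocList> parseAll(const std::vector<uint8_t> &buf,
                                     bool &ok) {
  std::vector<LocList> lists;
  size_t pos = 0;
  ok = true;
  while (ok && pos < buf.size()) {
    LocList list;
    ok = parseOne(buf, pos, list);
    lists.push_back(list); // keep partial lists so the damage is visible
  }
  return lists;
}

static void putLE(std::vector<uint8_t> &buf, uint64_t v, size_t width) {
  for (size_t i = 0; i < width; ++i)
    buf.push_back(static_cast<uint8_t>(v >> (8 * i)));
}

// Append one entry with a one-byte expression, then an end-of-list marker.
static void putList(std::vector<uint8_t> &buf, uint64_t begin, uint64_t end,
                    uint8_t op) {
  putLE(buf, begin, 8);
  putLE(buf, end, 8);
  putLE(buf, 1, 2);
  buf.push_back(op);
  putLE(buf, 0, 8);
  putLE(buf, 0, 8);
}

// Assumed layout: a live list, a list whose addresses were nulled, a live list.
static std::vector<uint8_t> sampleSection() {
  std::vector<uint8_t> buf;
  putList(buf, 0x11dd, 0x11f0, 0x55); // offset 0, DW_OP_reg5
  putList(buf, 0, 0, 0x54);           // offset 35, nulled addresses
  putList(buf, 0x11f7, 0x120a, 0x55); // offset 70, DW_OP_reg5
  return buf;
}
```

With this layout, parseAll treats the nulled entry at offset 35 as an
end-of-list marker, starts the next "list" in the middle of that entry, and
eventually runs off the end of the section, whereas calling parseOne directly
at offset 70 reads the last list correctly. That matches the behaviour seen
with --name=b above, where a direct lookup still works.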

Happily, if one runs `llvm-dwarfdump-8 a.out --name=b`, location lists past the
gap are read without error, so looking up a location list directly still
works.

What breaks, however, is the --statistics option to llvm-dwarfdump, which I've
been getting weird numbers out of for a while. It looks up location lists by
offset [0] through this API call [1], which appears to pre-read all location
lists and trips over the gap. When fed a binary such as the above, I get the
error message, and getLocationListAtOffset fails for some DW_AT_location
attributes. These then get interpreted as a location list fully covering all
scope bytes.
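
To make the accounting pitfall concrete, here is a minimal sketch. It is not
the code in Statistics.cpp: Range, Coverage, CoverageResult, coveredBytes and
variableCoverage are hypothetical names, and list entries are assumed not to
overlap. It only shows one way a failed list lookup could be reported
separately instead of being counted as full coverage.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [begin, end).
struct Range {
  uint64_t begin;
  uint64_t end;
};

enum class Coverage { Known, Unknown };

struct CoverageResult {
  Coverage kind;
  uint64_t bytes; // bytes of the scope with a known location
};

// Sum the parts of each list entry that fall inside the scope.
// Assumes the entries do not overlap one another.
static uint64_t coveredBytes(const Range &scope,
                             const std::vector<Range> &list) {
  uint64_t total = 0;
  for (const Range &r : list) {
    uint64_t lo = std::max(r.begin, scope.begin);
    uint64_t hi = std::min(r.end, scope.end);
    if (lo < hi)
      total += hi - lo;
  }
  return total;
}

// `list` is null when the variable uses a location list but the lookup
// by offset failed; `isLocList` is false for a single-expression location.
static CoverageResult variableCoverage(const Range &scope, bool isLocList,
                                       const std::vector<Range> *list) {
  if (!isLocList)
    return {Coverage::Known, scope.end - scope.begin}; // whole scope
  if (list == nullptr)
    return {Coverage::Unknown, 0}; // do not pretend the whole scope is covered
  return {Coverage::Known, coveredBytes(scope, *list)};
}
```

Whether an unknown result should be dropped from the statistics or reported as
an error is a separate design choice; the point is only that it is
distinguishable from a location that really does cover the whole scope.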

~

I'm not hugely familiar with DWARF, but I nerd-sniped PaulR by asking him about
this, and he seemed to reckon there's nothing in the spec that prohibits
padding between location lists.

One solution would be "don't link things with LD", but it is still the default
in many places.

Neither GOLD nor LLD leave a gap in .debug_loc in this situation.

[0]
https://github.com/llvm/llvm-project/blob/c714a88a4dc4dadc16409986a7e275b86142622b/llvm/tools/llvm-dwarfdump/Statistics.cpp#L251
[1]
https://github.com/llvm/llvm-project/blob/88b4e28a679a5aaa14ef41a1901d3d24ddd8946b/llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp#L202

Code to replicate this situation is below. I think the namespace might be
unnecessary.

test1.cpp
--------8<--------
#include "tmp.h"

int main() {
  thin::floogie foo;
  somefunc(foo);
  return foo.a_method(3);
}
-------->8--------

test2.cpp
--------8<--------
#include "tmp.h"

void somefunc(thin::floogie &foo) {
  foo.somemember = 12;
  foo.a_method(foo.somemember);
}

void externfunc() { }
-------->8--------

test3.cpp
--------8<--------
#include <stdio.h>
int unrelated(int a, int b) { printf("%d, %d\n", a, b); return a; }
-------->8--------

tmp.h
--------8<--------
void externfunc();

namespace thin {
  class floogie {
    public:
      floogie() : somemember(0) { }
      int somemember;
      bool a_method(int another) {
        another += 13;
        externfunc();
        another %= 3;
        somemember += 3;
        return somemember + another;
      }
  };
}

void somefunc(thin::floogie &bar);
-------->8--------
