[PATCH] D123469: [DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 21 09:56:52 PDT 2022


dblaikie added a comment.

In D123469#3463313 <https://reviews.llvm.org/D123469#3463313>, @clayborg wrote:

>> Do you have a small/standalone lto example of these overlapped ranges - to understand both why they happen, and why they're invalid?
>
> Just have LTO combine the contents of two functions with the same opcode bytes and you can end up with ranges within the same CU and across CUs that overlap. I don't consider these invalid, but if they are in the same CU, the verification using llvm-dwarfdump would emit errors.

This can and does happen without LTO; in fact I think it's less likely to happen /with/ LTO than without. In any case, here's a simple example with overlapping CU ranges that doesn't require LTO (it uses `__attribute__((nodebug))` to omit unnecessary debug info; the issue still reproduces without it, just with more noise):

inl.h:

  inline void f1() {
  }
  void a();

a.cpp:

  #include "inl.h"
  __attribute__((nodebug)) void a() {
    f1();
  }

b.cpp:

  #include "inl.h"
  void a();
  __attribute__((nodebug)) int main() {
    a();
    f1();
  }



  clang++ -g a.cpp b.cpp && llvm-dwarfdump a.out | grep "0x000000000\|DW_TAG\|DW_AT_name" | sed -e "s/^............//"

  DW_TAG_compile_unit
    DW_AT_name    ("a.cpp")
    DW_AT_low_pc  (0x0000000000401180)
    DW_AT_high_pc (0x0000000000401186)
    DW_TAG_subprogram
      DW_AT_low_pc        (0x0000000000401180)
      DW_AT_high_pc       (0x0000000000401186)
      DW_AT_name  ("f1")
  DW_TAG_compile_unit
    DW_AT_name    ("b.cpp")
    DW_AT_low_pc  (0x0000000000401180)
    DW_AT_high_pc (0x0000000000401186)
    DW_TAG_subprogram
      DW_AT_low_pc        (0x0000000000401180)
      DW_AT_high_pc       (0x0000000000401186)
      DW_AT_name  ("f1")

(This doesn't always happen: the two copies of `f1` only end up sharing an address range if they have identical instruction sequences. If they are optimized differently, whichever copy is chosen gets the range and the other gets a zero/tombstone address, like a gc'd function definition. Otherwise the DWARF would be bogus: it could be describing the instruction stream totally incorrectly, etc.)
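
As a rough illustration of what "combining" such ranges amounts to, here is a minimal sketch. It is not the code from D123469; `AddrRange` and `combineRanges` are hypothetical names, and treating ranges as half-open `[low, high)` intervals and merging ranges that merely touch are both assumptions of the sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a half-open address range [low, high).
struct AddrRange {
  uint64_t low;
  uint64_t high;
};

// Sort by start address, then fold each range that overlaps (or touches)
// the previously emitted range into that range.
std::vector<AddrRange> combineRanges(std::vector<AddrRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const AddrRange &a, const AddrRange &b) { return a.low < b.low; });
  std::vector<AddrRange> out;
  for (const AddrRange &r : ranges) {
    if (!out.empty() && r.low <= out.back().high)
      out.back().high = std::max(out.back().high, r.high);
    else
      out.push_back(r);
  }
  return out;
}
```

Fed the two identical CU ranges from the dump above, `[0x401180, 0x401186)` twice, this yields a single range.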

I'd expect this to also happen with Identical Code Folding, but at least a simple case doesn't seem to reproduce that behavior:

  void f1() { }
  void f2() { }
  __attribute__((nodebug)) int main() {
  }

  clang++ -O3 test.cpp -fno-addrsig -fuse-ld=lld -Wl,--icf=all -g -ffunction-sections && llvm-dwarfdump a.out | grep "0x000000000\|DW_TAG\|DW_AT_name" | sed -e "s/^............//"

  DW_TAG_compile_unit
    DW_AT_name    ("test.cpp")
    DW_AT_low_pc  (0x0000000000000000)
       [0x0000000000201720, 0x0000000000201721)
       [0x0000000000000000, 0x0000000000000001))
    DW_TAG_subprogram
      DW_AT_low_pc        (0x0000000000201720)
      DW_AT_high_pc       (0x0000000000201721)
      DW_AT_name  ("f1")
    DW_TAG_subprogram
      DW_AT_low_pc        (0x0000000000000000)
      DW_AT_high_pc       (0x0000000000000001)
      DW_AT_name  ("f2")
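
To make the distinction between this dump and the earlier one concrete, here is a small sketch that counts overlapping pairs among "live" ranges only. The names (`AddrRange`, `isTombstoned`, `overlaps`, `countLiveOverlaps`) are hypothetical, and treating a low address of 0 as the tombstone is an assumption taken from this particular dump; it would be wrong on a target where address 0 holds real code, or with a linker that uses a different tombstone value.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a half-open address range [low, high).
struct AddrRange {
  uint64_t low;
  uint64_t high;
};

// Assumption: a range starting at address 0 describes a discarded function.
bool isTombstoned(const AddrRange &r) { return r.low == 0; }

// Two half-open ranges overlap if each starts before the other ends.
bool overlaps(const AddrRange &a, const AddrRange &b) {
  return a.low < b.high && b.low < a.high;
}

// Count the pairs of non-tombstoned ranges that overlap.
std::size_t countLiveOverlaps(const std::vector<AddrRange> &ranges) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < ranges.size(); ++i) {
    if (isTombstoned(ranges[i]))
      continue;
    for (std::size_t j = i + 1; j < ranges.size(); ++j)
      if (!isTombstoned(ranges[j]) && overlaps(ranges[i], ranges[j]))
        ++count;
  }
  return count;
}
```

With the `f1`/`f2` ranges from the ICF dump this reports no overlap, while the two `f1` ranges from the a.cpp/b.cpp example count as one overlapping pair.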

Actually LLVM's LTO seems least likely to produce overlapping ranges - its IR representation doesn't map one function to multiple DWARF subprograms (the mapping from IR function to subprogram is from the IR function to a single DISubprogram - it can't represent more than one subprogram associated with a given IR function). Though possibly GCC's LTO or the like might have different behavior in this regard.

> I have also seen bad DWARF where you have two functions that don't share the exact same range and yet do overlap. This is mostly again in LTO binaries where the LTO linker tried to change the DWARF.

I'd be curious to hear more about that, or see a reproduction if you've got one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123469/new/

https://reviews.llvm.org/D123469


