[PATCH] D124082: [Debuginfo][llvm-dwarfdump][dsymutil] Add dsymutil compatibility dump.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Apr 23 08:14:51 PDT 2022


avl added a comment.

> Instead of doing a dumping thing, it might be a good idea to have a comparison mode build into llvm-dsymutil that can compare the contents of two files maybe?

Agreed, creating a smarter tool (one that does complex comparisons of DIE trees) makes sense. It probably should not be llvm-dsymutil, though; it might be llvm-dwarfdump or a separate tool.

Still, I think the dump suggested in this patch is quite useful, especially for checking the type de-duplication optimization. Please see the usage example later in this message.

> As David Blaikie suggested, this output format won't help diff things as we enable more complex garbage collection and type uniquing. If we do want a dump format that can be diffed, it should probably be output in a format that sorts things a bit better. Like sorting all functions by address and making a common output format for them. Types would be output sorted by type name, but it would be harder to diff especially since the other llvm-dsymutil did more simple type uniquing and the new modifications you are trying to get in really merge all of the types.

If we think that adding sorting would make this patch better, I am happy to add it.

Though, even without such sorting, this patch is quite powerful. First, it does not depend on the type DIEs (DW_TAG_*_type and others) at all: type DIEs can appear in any order and be duplicated any number of times, because only names are printed for types (so there is no need to sort type DIEs). Second, dsymutil does not change the order of DW_TAG_subprogram DIEs, so we can compare effectively without sorting them. We would need to sort them only if some other tool changes the order of DW_TAG_subprogram DIEs.

This patch also works well when type de-duplication is done in a completely different way.
With some small modifications for declarations, it helps to compare the output of the current dsymutil with the output of dsymutil from patch D96035 <https://reviews.llvm.org/D96035> (even if it works in non-deterministic mode).
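
As a toy illustration of why printing only type names makes the dump insensitive to duplicated type DIEs: the sketch below is not the patch's implementation. The classes `TypeDie` and `Subprogram`, the functions `dump_all_dies` and `dump_names_only`, and the output format are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeDie:
    offset: int  # position of the type DIE in the (toy) debug info
    name: str    # printable type name, e.g. "int"


@dataclass(frozen=True)
class Subprogram:
    name: str
    type_offset: int  # return type, referenced by DIE offset


def dump_all_dies(types, subprograms):
    """Toy 'list every DIE' dump with offsets omitted: type DIEs are printed too."""
    names = {t.offset: t.name for t in types}
    lines = [f'DW_TAG_base_type "{t.name}"' for t in types]
    lines += [f'DW_TAG_subprogram "{s.name}": "{names[s.type_offset]}"'
              for s in subprograms]
    return lines


def dump_names_only(types, subprograms):
    """Toy compat-style dump: skip type DIEs, print types only by name."""
    names = {t.offset: t.name for t in types}
    return [f'DW_TAG_subprogram "{s.name}": "{names[s.type_offset]}"'
            for s in subprograms]


# Without de-duplication: each compile unit carries its own copy of "int".
noodr_types = [TypeDie(0x10, "int"), TypeDie(0x40, "int")]
noodr_subprograms = [Subprogram("foo", 0x10), Subprogram("bar", 0x40)]

# With de-duplication: a single "int" DIE shared by both subprograms.
odr_types = [TypeDie(0x10, "int")]
odr_subprograms = [Subprogram("foo", 0x10), Subprogram("bar", 0x10)]

full_differs = (dump_all_dies(odr_types, odr_subprograms)
                != dump_all_dies(noodr_types, noodr_subprograms))
names_match = (dump_names_only(odr_types, odr_subprograms)
               == dump_names_only(noodr_types, noodr_subprograms))
print(full_differs, names_match)  # prints: True True
```

In this toy model the full dump differs only because the number of type DIEs differs, while the names-only dump is identical for both inputs.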

> There is already a "--diff" option to this tool that omits all addresses and offsets to help diff two files so chunks that were removed can easily be identified, does that mode not work for you?

No, it does not. For example, if type de-duplication is done, the final DWARF files (with ODR and without ODR) will contain different numbers of type DIEs, even if they are semantically the same. Thus dumps created with the "--diff" option will not match, and someone needs to inspect the differences (which might be quite complex) to understand whether the files are semantically equal. In contrast, dumps created with the "--dsymutil-compat-dump" option will be 100% equal. Let's look at an example:

  dsymutil llvm-strings
  
  llvm-dwarfdump --debug-info --diff llvm-strings.dSYM/Contents/Resources/DWARF/llvm-strings > llvm-strings-diff-dump
  llvm-dwarfdump --dsymutil-compat-dump llvm-strings.dSYM/Contents/Resources/DWARF/llvm-strings > llvm-strings-compat-dump
  
  dsymutil --no-odr llvm-strings
  
  llvm-dwarfdump --debug-info --diff llvm-strings.dSYM/Contents/Resources/DWARF/llvm-strings > llvm-strings-noodr-diff-dump
  llvm-dwarfdump --dsymutil-compat-dump llvm-strings.dSYM/Contents/Resources/DWARF/llvm-strings > llvm-strings-noodr-compat-dump
  
  
  diff llvm-strings-diff-dump llvm-strings-noodr-diff-dump | wc -c | awk '{print $1/1000"K"}'
  68563,8K
  
  diff llvm-strings-compat-dump llvm-strings-noodr-compat-dump | wc -c | awk '{print $1/1000"K"}'
  0K

One of the reasons dsymutil-compat-dump shows so few differences is that it skips a lot.
We can probably make it more detailed. But even in this form it is able to catch errors. For example, it shows
that dsymutil applied to the "opt" binary has a bug in the ODR de-duplication optimization:

  diff opt-compat-dump opt-noodr-compat-dump 
  12158120c12158120
  <      DW_AT_specification "operator()": "llvm::VPBasicBlock *"
  ---
  >      DW_AT_specification "operator()": "const llvm::VPBasicBlock *"

The "const" qualifier is lost.

In short, a powerful tool that compares type DIE trees would be useful, but it would also be much more complex, while this dump is quite simple and can already help find some classes of errors.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124082/new/

https://reviews.llvm.org/D124082


