[PATCH] D59672: [pdb] Add -type-stats and sort stats by descending size

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 22 13:24:07 PDT 2019


rnk marked 3 inline comments as done.
rnk added a comment.

I happen to know that @zturner is busy from today until next Thursday, so I'm going to go ahead and land this with some tweaks. Since this is only the dumper, I think post-commit review is fine if he has any suggestions for improving or simplifying the code.

In D59672#1439442 <https://reviews.llvm.org/D59672#1439442>, @aganea wrote:

> This is what I'm seeing for a large PDB (2 GB). Things are a bit different from your use-case:
>
>                        Type Record Stats
>   ============================================================
>  
>     Types
>              Total: 7050382 entries ( 536,798,320 bytes,   76.14 avg)
>     --------------------------------------------------------------------------
>       LF_FIELDLIST:  369709 entries ( 116,277,336 bytes,  314.51 avg)
>


That's interesting; it's more consistent with what I expected to find. I think this indicates that your codebase has templates with more members, and therefore longer field (and method) lists. I was surprised that the LF_CLASS and LF_STRUCTURE records dominated browser_tests.exe.pdb.

>   LF_VFTABLE:   48428 entries ( 106,221,244 bytes, 2193.38 avg)
>    

This is an interesting result. LLVM can't produce these records, only MSVC can. I asked Dave Bartolomeo and YongKang about their purpose, and they told me that they are used for devirtualization at LTCG time. Given that we don't use them, I bet we can just discard them from the PDB by default, with a flag to explicitly request them if desired. These records include the long mangled names of all virtual methods, so they add up.

>   LF_STRUCTURE:  502958 entries (  92,283,348 bytes,  183.48 avg)
>   LF_MFUNCTION: 2699622 entries (  75,589,416 bytes,   28.00 avg)
>    

I guess this LF_MFUNCTION result is consistent with the long LF_FIELDLIST records: it probably indicates that classes with many methods are instantiated many times with varying parameters across the codebase.

>        LF_CLASS:  323609 entries (  72,441,564 bytes,  223.86 avg)
>         LF_ENUM:  122798 entries (  24,322,364 bytes,  198.07 avg)
>      LF_POINTER: 1262870 entries (  15,468,832 bytes,   12.25 avg)
>      LF_ARGLIST:  794482 entries (  12,368,496 bytes,   15.57 avg)
>   LF_METHODLIST:  462337 entries (  11,367,124 bytes,   24.59 avg)
>        LF_UNION:   29152 entries (   4,260,840 bytes,  146.16 avg)
>    LF_PROCEDURE:  236117 entries (   3,777,872 bytes,   16.00 avg)
>     LF_MODIFIER:  189660 entries (   2,275,920 bytes,   12.00 avg)
>        LF_ARRAY:    7766 entries (     124,912 bytes,   16.08 avg)
>      LF_VTSHAPE:     175 entries (      10,668 bytes,   60.96 avg)
>     LF_BITFIELD:     698 entries (       8,376 bytes,   12.00 avg)
>        LF_LABEL:       1 entries (           8 bytes,    8.00 avg)
>    
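
For anyone reading the table above, here is a minimal sketch of the ordering and the "avg" column, not the code in the patch. `KindStat`, `averageSize` and `sortByDescendingSize` are hypothetical names invented for this illustration; the only assumption is that each record kind carries a count and a total byte size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the dumper's per-kind counters.
struct KindStat {
  std::string Name;
  uint64_t Count; // number of records of this kind
  uint64_t Size;  // total bytes across those records
};

// Average record size in bytes; 0 when there are no records.
double averageSize(const KindStat &S) {
  return S.Count == 0 ? 0.0 : static_cast<double>(S.Size) / S.Count;
}

// Returns a copy ordered by total size, largest first.
// stable_sort keeps the input order for kinds with equal size.
std::vector<KindStat> sortByDescendingSize(std::vector<KindStat> Stats) {
  std::stable_sort(Stats.begin(), Stats.end(),
                   [](const KindStat &A, const KindStat &B) {
                     return A.Size > B.Size;
                   });
  return Stats;
}
```

Feeding it the LF_STRUCTURE, LF_FIELDLIST and LF_VFTABLE rows in any order should put LF_FIELDLIST first, and dividing 116,277,336 bytes by 369709 entries gives the 314.51 average shown in the table.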





================
Comment at: llvm/test/DebugInfo/PDB/udt-stats.test:12
 CHECK-NEXT:      LF_PROCEDURE |     1     16
-CHECK-NEXT:      LF_STRUCTURE |    27  1,788
+CHECK-NEXT:     <simple type> |    43      0
 CHECK-NEXT:     -----------------------------
----------------
aganea wrote:
> Why is <simple type> zero size?
I think this is the size of the type record being referenced, not the S_UDT record. In the case of simple types, there are no type records. This would correspond to something like `typedef void *voidptr_t;`.
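
To make that concrete, a rough sketch, assuming the usual CodeView convention that type indices below 0x1000 name built-in ("simple") types and so have no record in the type stream. `isSimpleType` and `attributedSize` are hypothetical helpers for illustration, not functions from llvm-pdbutil.

```cpp
#include <cassert>
#include <cstdint>

// CodeView type indices below 0x1000 denote built-in ("simple") types.
constexpr uint32_t FirstNonSimpleIndex = 0x1000;

bool isSimpleType(uint32_t TypeIndex) {
  return TypeIndex < FirstNonSimpleIndex;
}

// Hypothetical helper: bytes attributed to the type record a UDT refers to.
// RecordLength is whatever length the caller looked up for a non-simple
// index; a simple index has no record, so it contributes 0 bytes.
uint32_t attributedSize(uint32_t TypeIndex, uint32_t RecordLength) {
  return isSimpleType(TypeIndex) ? 0 : RecordLength;
}
```

Under that assumption, every S_UDT that points at a simple type index adds to the count but nothing to the size, which is how the test ends up with 43 entries and 0 bytes.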


================
Comment at: llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp:339
     std::string KindName = formatModuleDetailKind(Kind(K.first));
-    P.formatLine("{0,40}: {1,7} entries ({2,8} bytes)", KindName,
+    P.formatLine("{0,40}: {1,7} entries ({2,8:N} bytes)", KindName,
                  K.second.Count, K.second.Size);
----------------
aganea wrote:
> Would you mind increasing {1,7} to {1,8} and {2,8:N} to {2,10:N} please? The output is offsetted on my end:
> ```
>                                   S_UNAMESPACE:   45458 entries (  904548 bytes)
>                                     S_REGREL32: 4138872 entries (115147908 bytes)
>                                        S_LOCAL: 2487396 entries (57712788 bytes)
> ```
Definitely. I actually went up to {2,12:N} for types, which is shown further down.
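
As a rough illustration of why the wider field matters, here is a standalone sketch that mimics a right-aligned field with thousands separators. It is plain C++ standing in for the `{2,12:N}` format spec, not the formatv implementation, and `withThousands` and `padLeft` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Groups digits in threes with commas, e.g. 904548 -> "904,548".
std::string withThousands(uint64_t Value) {
  std::string Digits = std::to_string(Value);
  std::string Out;
  for (std::size_t I = 0; I < Digits.size(); ++I) {
    // Insert a comma whenever a multiple of three digits remains.
    if (I != 0 && (Digits.size() - I) % 3 == 0)
      Out += ',';
    Out += Digits[I];
  }
  return Out;
}

// Right-aligns Text in a field of Width columns.
// Text longer than Width is returned unchanged, which is what pushes
// a column out of alignment.
std::string padLeft(const std::string &Text, std::size_t Width) {
  if (Text.size() >= Width)
    return Text;
  return std::string(Width - Text.size(), ' ') + Text;
}
```

With separators, 115147908 becomes "115,147,908", which is 11 characters: it overflows a width of 8 or 10 but fits in 12.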


================
Comment at: llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp:702
+  StatCollection TypeStats;
+  auto &Types = File.types();
+  for (Optional<TypeIndex> TI = Types.getFirst(); TI; TI = Types.getNext(*TI)) {
----------------
aganea wrote:
> Rui says no `auto` in LLD (when the type isn't obvious), is that a policy that should apply everywhere in LLVM?
Yeah, I'll remove this. @zturner wrote a lot of this code, and I think he has a different attitude towards auto usage.
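
For illustration only, this is roughly what the loop looks like with the collection type spelled out instead of `auto`. `ToyTypeCollection`, `ToyFile` and `totalBytes` are hypothetical stand-ins, and `std::optional<uint32_t>` stands in for `Optional<TypeIndex>`, so the sketch compiles without LLVM headers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type collection with first/next iteration.
class ToyTypeCollection {
public:
  explicit ToyTypeCollection(std::vector<uint32_t> Lengths)
      : Lengths(std::move(Lengths)) {}

  // Index of the first record, or nullopt if the collection is empty.
  std::optional<uint32_t> getFirst() const {
    if (Lengths.empty())
      return std::nullopt;
    return uint32_t{0};
  }

  // Index following Prev, or nullopt once the end is reached.
  std::optional<uint32_t> getNext(uint32_t Prev) const {
    if (Prev + 1 >= Lengths.size())
      return std::nullopt;
    return Prev + 1;
  }

  uint32_t recordLength(uint32_t Index) const { return Lengths.at(Index); }

private:
  std::vector<uint32_t> Lengths;
};

// Hypothetical stand-in for the PDB file object.
struct ToyFile {
  ToyTypeCollection Collection;
  const ToyTypeCollection &types() const { return Collection; }
};

// Sums record lengths; the collection type is written out rather than
// deduced, so a reader does not need to look up what types() returns.
uint64_t totalBytes(const ToyFile &File) {
  const ToyTypeCollection &Types = File.types();
  uint64_t Total = 0;
  for (std::optional<uint32_t> TI = Types.getFirst(); TI;
       TI = Types.getNext(*TI))
    Total += Types.recordLength(*TI);
  return Total;
}
```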


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59672/new/

https://reviews.llvm.org/D59672
