<div dir="ltr">I didn't believe you at first that DIA SDK didn't support partial PDBs, so I went and tried `llvm-pdbdump pretty -types foo.pdb` on a partial PDB and it caused llvm-pdbdump to crash. When I looked further, it turns out IDiaSymbol::findChildren() is returning E_NOTIMPL. Wow! I'm a bit surprised honestly.<div><br></div><div>I've pushed a fix for this in r304982, but all that does is make llvm-pdbdump not crash. It still doesn't display any types.</div><div><br></div><div>Luckily llvm-pdbdump has another mode (accessible via the `raw` subcommand) that can bypass the DIA SDK and show you the underlying structure. Here's what I get when I try dumping types of a partial PDB.</div><div><br></div><div><div>D:\src\llvmbuild\ninja>bin\llvm-pdbdump raw -tpi-records cpptest.pdb</div><div>Type Info Stream (TPI) {</div><div> TPI Version: 20040203</div><div> Record count: 0</div><div> Records [</div><div> TypeIndexOffsets [</div><div> ]</div><div> ]</div><div>}</div></div><div><br></div><div>Umm, ok. So there's *actually* no types in the PDB.</div><div><br></div><div>Let's try symbols.</div><div><br></div><div><div>D:\src\llvmbuild\ninja>bin\llvm-pdbdump raw -module-syms cpptest.pdb</div><div>DBI Stream {</div><div> # snip</div><div> Modules [<br></div><div> {<br></div><div> Name: test2.obj</div><div> # snip</div><div> Symbols [<br></div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 52</div><div> }</div><div> }</div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 64</div><div> }</div><div> }</div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 60</div><div> }</div><div> }</div><div> # thousands of similar lines snipped.</div></div><div><br></div><div>So this is a little bit more interesting. 
Let's see what these records look like:</div><div><br></div><div><div>D:\src\llvmbuild\ninja>bin\llvm-pdbdump raw -module-syms -sym-record-bytes cpptest.pdb</div><div>DBI Stream {</div><div> # snip</div><div> Modules [<br></div><div> {<br></div><div> Name: test2.obj</div><div> # snip</div><div> Symbols [<br></div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 52</div><div> }</div><div> Bytes (</div><div> 0000: 30140000 04005F5F 76635F61 74747269 |0.....__vc_attri|</div><div> 0010: 62757465 733A3A65 76656E74 5F736F75 |butes::event_sou|</div><div> 0020: 72636541 74747269 62757465 00000000 |rceAttribute....|</div><div> )</div><div> }</div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 64</div><div> }</div><div> Bytes (</div><div> 0000: 29140000 04005F5F 76635F61 74747269 |).....__vc_attri|</div><div> 0010: 62757465 733A3A65 76656E74 5F736F75 |butes::event_sou|</div><div> 0020: 72636541 74747269 62757465 3A3A6F70 |rceAttribute::op|</div><div> 0030: 74696D69 7A655F65 00000000 |timize_e....|</div><div> )</div><div> }</div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 60</div><div> }</div><div> Bytes (</div><div> 0000: 27140000 04005F5F 76635F61 74747269 |'.....__vc_attri|</div><div> 0010: 62757465 733A3A65 76656E74 5F736F75 |butes::event_sou|</div><div> 0020: 72636541 74747269 62757465 3A3A7479 |rceAttribute::ty|</div><div> 0030: 70655F65 00000000 |pe_e....|</div><div> )</div><div> }</div><div> {</div><div> UnknownSym {</div><div> Kind: 0x1167</div><div> Length: 68</div><div> }</div><div> Bytes (</div><div> 0000: 0C140000 04005F5F 76635F61 74747269 |......__vc_attri|</div><div> 0010: 62757465 733A3A68 656C7065 725F6174 |butes::helper_at|</div><div> 0020: 74726962 75746573 3A3A7631 5F616C74 |tributes::v1_alt|</div><div> 0030: 74797065 41747472 69627574 65000000 |typeAttribute...|</div><div> )</div><div> }</div></div><div><br></div><div>So, this symbol record with kind 0x1167 is pretty 
interesting, and clearly related to /debug:fastlink. Its format can be deduced as something like this:</div><div><br></div><div>struct DebugFastLinkRecord {</div><div> char Unknown[6];</div><div> char Name[0]; // null-terminated string</div><div> char Padding[0]; // pad to 4 bytes</div><div>};</div><div><br></div><div>What those first 6 bytes are I can't tell you.</div><div><br></div><div>Let's see what else we can find. Another source of interesting debug info comes from what I refer to as "debug subsections". In an object file, every .debug$S section is basically just a big list of these. In a PDB file though, the debug subsections appear embedded inside each module's debug stream, which is similar to a .debug$S section, but with some additional PDB-specific stuff. You can find llvm-pdbdump's code for parsing this in ModuleDebugStream.cpp.</div><div><br></div><div>Anyway, the part we're interested in can be dumped using llvm-pdbdump raw -subsections=unknown. I say unknown because we're looking for stuff that is unique to /debug:fastlink PDBs, so presumably any /debug:fastlink specific data would be something we don't know about / have never seen before. (Note that this command line option hasn't made it upstream yet; it's still in review. 
But expect it today or tomorrow if all goes well).</div><div><br></div><div>So we'll try this:</div><div><br></div><div><div>bin\llvm-pdbdump raw -subsections=unknown cpptest.pdb</div><div>DBI Stream {</div><div> # snip</div><div> Modules [<br></div><div> {<br></div><div> Name: test2.obj</div><div> # snip</div><div> Subsections [<br></div><div> Unknown {</div><div> Kind: 0xFD</div><div> Data (</div><div> 0000: 00000000 00000000 00000000 00000000 |................|</div><div> 0010: 00000000 00000000 00000000 00000000 |................|</div><div> 0020: 00000000 00000000 00000000 B0240100 |.............$..|</div><div> 0030: 00000000 00000000 00000000 00000000 |................|</div><div> 0040: 00000000 B0240100 90270100 D0270100 |.....$...'...'..|</div><div> 0050: 90990100 00000000 00000000 90990100 |................|</div><div> 0060: A49C0100 00000000 00000000 A49C0100 |................|</div><div> )</div><div> }</div><div> ]</div><div> }</div><div><br></div></div><div>Neat! What is this thing? 0xFD is 253, and looking that up in our <a href="https://github.com/llvm-mirror/llvm/blob/master/include/llvm/DebugInfo/CodeView/CodeView.h#L317">DebugSubsectionKind enumeration</a> shows that this is a CoffSymbolRVA subsection.</div><div><br></div><div>The format of that subsection can very likely be understood by reading the code in the Microsoft repo, but I haven't investigated it yet.</div><div><br></div><div>Hopefully this is a good starting point. 
llvm-pdbdump is a pretty useful tool for investigating these types of issues, so let me know if you try it out and have suggestions for how to improve it.</div><div><br></div><div>As mentioned, some of the commands I demonstrated above are not upstream yet, but I'll try to get them in this week.</div><br><div class="gmail_quote"><div dir="ltr">On Thu, Jun 8, 2017 at 5:07 AM Will Wilson &lt;<a href="mailto:will@indefiant.com" target="_blank">will@indefiant.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Zach (or anyone else who may have a clue),<div><br></div><div>I'm currently investigating making use of LLVM for PDB parsing with a view to supporting partial PDBs as produced by /DEBUG:FASTLINK, as the VS DIA SDK hasn't been updated to handle them. I know this is probably low on your priority list, but since /DEBUG:FASTLINK is now the implied default for VS2017, I figure it's a good time to take a look at it.</div><div><br></div><div>Unfortunately I'm finding very little information on the internal structure used by partial PDBs. It seems <a href="https://github.com/Microsoft/microsoft-pdb" target="_blank">https://github.com/Microsoft/microsoft-pdb</a> doesn't offer much either, unless I'm missing something...</div><div><br></div><div>So, two questions: Are you planning to try and support partial PDBs? And do you have any good references for their layout?</div><div><br></div><div>Many thanks,</div><div>Will.<br clear="all"><div><br></div>
</div></div>
</blockquote></div></div>
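<div dir="ltr"><div>P.S. To make the guessed layout of the 0x1167 record concrete, here is a minimal Python sketch that splits one record payload into the 6 unknown bytes, the name, and the padding. The function name parse_fastlink_record is made up for illustration and is not part of llvm-pdbdump, and the layout is only the one deduced from the dumps in this mail, so treat it as an assumption rather than the real format.</div><div><br></div>
<pre>
```python
def parse_fastlink_record(payload):
    """Split a kind-0x1167 payload into (unknown, name, padding).

    Assumed layout (a guess from the dumps, not a documented format):
    6 unknown bytes, a null-terminated ASCII name, zero padding to 4 bytes.
    """
    header_size = 6
    # bytes.index raises ValueError if there is no terminator after the header.
    terminator = payload.index(b"\x00", header_size)
    unknown = payload[:header_size]
    name = payload[header_size:terminator].decode("ascii")
    padding = payload[terminator + 1:]
    # The guess says everything after the terminator is zero padding
    # and that the payload ends on a 4-byte boundary.
    if any(padding) or len(payload) % 4 != 0:
        raise ValueError("payload does not match the assumed layout")
    return unknown, name, padding


# First record from the -sym-record-bytes dump.
SAMPLE = bytes.fromhex(
    "30140000 04005F5F 76635F61 74747269"
    "62757465 733A3A65 76656E74 5F736F75"
    "72636541 74747269 62757465 00000000"
)

if __name__ == "__main__":
    unknown, name, padding = parse_fastlink_record(SAMPLE)
    print(unknown.hex(), name, len(padding))
```
</pre>
<div>Feeding it the other records from the dump is a quick way to check whether the "6 bytes, name, pad to 4" guess holds for them too. It deliberately does not try to interpret the first 6 bytes.</div></div>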