<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - LLD doesn't output global refs when emitting PDB"
href="https://bugs.llvm.org/show_bug.cgi?id=37992">37992</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>LLD doesn't output global refs when emitting PDB
</td>
</tr>
<tr>
<th>Product</th>
<td>lld
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>COFF
</td>
</tr>
<tr>
<th>Assignee</th>
<td>zturner@google.com
</td>
</tr>
<tr>
<th>Reporter</th>
<td>zturner@google.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>A module's debug info stream contains a list of symbols, then some codeview
debug subsections (e.g. file checksums, lines, cross module exports, etc), and
at the very end is a list of "global refs". We didn't know what these were
before, so LLD currently just writes an empty list.
I now understand what these are (although I don't understand what the debugger
uses them for). To understand what they are, it's helpful to view some output
from llvm-pdbutil.
llvm-pdbutil.exe dump -globals cpptest-lld.pdb
Global Symbols
============================================================
Records
236 | S_PROCREF [size = 32] `Derived::Derived`
module = 1, sum name = 0, offset = 132
216 | S_PROCREF [size = 20] `main`
module = 1, sum name = 0, offset = 52
268 | S_PROCREF [size = 28] `Derived::V2`
module = 1, sum name = 0, offset = 228
296 | S_PROCREF [size = 28] `Base::Base`
module = 1, sum name = 0, offset = 320
324 | S_PROCREF [size = 24] `Base::V2`
module = 1, sum name = 0, offset = 412
This data all comes from the globals stream. The "global refs" section at the
end of the module stream is basically the reverse mapping. It presumably
allows the debugger to quickly find all global symbols referenced by a
particular module. I'm not sure why it uses this, but in any case, the format
appears to be:
0x0000: <Number of bytes used by the following list>
0x0004: <Offset of the 1st item in the global symbol stream referenced by
this module>
0x0008: <Offset of the 2nd item in the global symbol stream referenced by
this module>
...
0x0000 + 4*N: <Offset of the Nth item in the global symbol stream referenced by
this module>
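
As a rough sketch of that layout (Python rather than LLD's C++, with a
hypothetical helper name, and assuming every field is a 4-byte little-endian
unsigned integer), serializing one module's list might look like this:

```python
def encode_global_refs(offsets):
    """Serialize one module's global refs (hypothetical helper, not LLD code).

    Assumed layout: a 4-byte little-endian byte count, followed by one
    4-byte little-endian globals-stream offset per referenced symbol.
    """
    body = b"".join(off.to_bytes(4, "little") for off in offsets)
    return len(body).to_bytes(4, "little") + body


# An empty list is just the 4-byte size field set to zero.
empty = encode_global_refs([])
```

The size field counts bytes, not entries, so the entry count is size / 4.
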
So for the above example, we can see that for module 1, the offsets are 236,
216, 268, 296, 324. There are 5 such global symbols referenced by module 1,
so module 1's global ref array would be, in little-endian binary:
0x0: 14000000 // 0x14 = 20 bytes follow (20 / 4 = 5 entries)
0x4: EC000000 // 0xEC = 236 is the offset of the first global ref
0x8: D8000000 // 0xD8 = 216 is the offset of the second global ref
0xC: 0C010000 // 0x10C = 268 is the offset of the third global ref
0x10: 28010000 // 0x128 = 296 is the offset of the fourth global ref
0x14: 44010000 // 0x144 = 324 is the offset of the fifth global ref</pre>
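<pre>To sanity-check the bytes above, here is a small decoding sketch (Python, with
a hypothetical function name, and assuming the layout is exactly as guessed in
this report: a 4-byte little-endian byte count followed by 4-byte little-endian
offsets). It is an illustration only, not how llvm-pdbutil or LLD parse this.

```python
def decode_global_refs(data):
    """Parse a global refs array into a list of globals-stream offsets."""
    size = int.from_bytes(data[0:4], "little")
    # The size field counts the bytes that follow, so it must be a multiple
    # of 4 and match the remaining length of the buffer.
    if size % 4 != 0 or len(data) != 4 + size:
        raise ValueError("malformed global refs array")
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(4, 4 + size, 4)]


# The six 32-bit words from the example for module 1.
example = bytes.fromhex(
    "14000000" "EC000000" "D8000000" "0C010000" "28010000" "44010000"
)
print(decode_global_refs(example))  # [236, 216, 268, 296, 324]
```

The decoded values match the S_PROCREF record offsets in the globals dump.
</pre>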
</div>
</p>
</body>
</html>