[PATCH] D54554: [PDB] Add symbol records in bulk

Fri Nov 16 14:47:50 PST 2018

rnk marked 2 inline comments as done.
rnk added inline comments.

================
Comment at: lld/COFF/PDB.cpp:813
     // FileStaticSym::ModFileOffset
-    recordStringTableReferenceAtOffset(Contents, 4, StrTableRefs);
+    recordStringTableReferenceAtOffset(Contents, 8, StrTableRefs);
     break;
----------------
zturner wrote:
> Why did this change from 4 to 8?
I changed it to take the entire record including the 4 byte prefix, which made more sense to me. It simplifies the caller.

================
Comment at: lld/COFF/PDB.cpp:1070
+  if (NeedsRealignment) {
+    void *AlignedData = Alloc.Allocate(RealignedSize, 4);
+    AlignedSymbolMem = makeMutableArrayRef(
----------------
zturner wrote:
> Is the 4 here supposed to be the alignment?  How about `alignOf(CodeViewContainer::Pdb)`?
It is. Would you be open to coming up with a less noisy of writing that, like (codeview::)SymbolAlignmentForPDB?

================
Comment at: lld/COFF/PDB.cpp:1078
+  ArrayRef<uint8_t> BulkSymbols;
+  cantFail(forEachCodeViewRecord<CVSymbol>(
+      SymsBuffer, [&](CVSymbol Sym) -> llvm::Error {
----------------
zturner wrote:
> I feel like this is do-able with just less than 2 passes.
> 
> What if we kept a vector of offsets into the `SymData` buffer, and the offsets would represent the position of records which required an alignment adjustment.  If you get through the loop and the vector is empty, then that is the same as `NeedsAlignment == false`.
> 
> As you iterate the records, you remap the type indices in the original buffer, which may still require alignment, but you don't actually do any re-alignment yet.  For each record that does require realignment, you push its offset onto the vector.  Note that you can collapse adjacent alignments, so for example if you have Record 1 which is at offset 3 mod 4, and Record 2 is also offset 3 mod 4, then only 1 re-alignment is required.  Because as soon as you realign the output buffer to 0 mod 4, then both records will be re-aligned.
> 
> So you're basically pushing offsets onto the vector when offset % 4 changes.
> 
> At the end of this process, all remapping has been done, but no re-alignment has occurred.
> 
> If the offset vector is empty, you're done.  Just return, and you only made 1 pass over the records.
> 
> If the offset vector is not empty, you walk it, and for each offset you add Offset[n] - Offset[n-1] bytes of padding, then copy in bulk up to the next offset.
> 
> Now, in the best case scenario, you only have 1 pass over the inputs, but in the worst case scenario, you don't have O(2n), you have O(n+m) where m < n.
I like the idea of remapping in place in the first pass, and doing the realignment as an optional second pass, that's a good improvement. This is also similar to what I described below, starting with "My plan". There are basically two cases:
1. every record is aligned (not implemented yet)
2. most records are unaligned
Your proposal requires memcpying every unaligned record in a second pass, but it uses a temporary vector on the side to hold the offsets. I don't think the vector is needed, we can just iterate again with `forEachCodeViewRecord`.

Basically, the plan is to combine the two loops that we have here, and do a second pass to realign. The drawback is that we have to "filter" records into global and module streams during the second pass if it runs, which means we're doing "wasted" work on the first pass if the object file isn't prealigned.

I don't want to implement this optimization in this patch. I'd rather combine it with an LLVM change to pre-align the records, do what we've described here, and benchmark that by itself to see if it's worth the complexity.

What we have here with the up-front pass is probably fastest for unaligned symbols, so IMO we should stick with it for now.

https://reviews.llvm.org/D54554