[PATCH] D81296: [PDB] Defer public serialization until PDB writing
Reid "Away June-Sep" Kleckner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 30 11:24:44 PDT 2020
rnk marked 8 inline comments as done.
rnk added inline comments.
================
Comment at: llvm/include/llvm/DebugInfo/PDB/Native/GSIStreamBuilder.h:122
+ // PublicSymFlags.
+ uint16_t Flags : 3;
----------------
aganea wrote:
> `PublicSymFlags` needs 4 bits, you could take one from `BucketIdx` below since you don't need more than 12 bits for that - you're doing `% IPHR_HASH` below.
Ah, true. I looked at the enum and thought it was a four value enum, not a four *flag* enum.
================
Comment at: llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp:43
std::vector<PSHashRecord> HashRecords;
std::array<support::ulittle32_t, (IPHR_HASH + 32) / 32> HashBitmap;
std::vector<support::ulittle32_t> HashBuckets;
----------------
hans wrote:
> I see this was just moved, but if the goal is to compute ceil(IPHR_HASH / 32) then I think +31 instead of +32 would be enough.
For some reason, the PDB on disk has `IPHR_HASH + 1` buckets, and this needs to be reflected in the bitmap, which needs `ceil(IPHR_HASH+1/32)` words. The last bucket is always empty, since the hash is always calculated as `crc32 % IPHR_HASH` (not `%IPHR_HASH+1`).
I don't see any comments about this, but I thought they were in here somewhere, since I remember dealing with this edge case in the past. I'll add some.
================
Comment at: llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp:198
+ // bucket start offsets.
+ uint32_t BucketStarts[IPHR_HASH];
+ memset(&BucketStarts[0], 0, sizeof(BucketStarts));
----------------
aganea wrote:
> You could also do a one-liner `uint32_t BucketStarts[IPHR_HASH] = {0};`
Sure. I think I tend to shy away from this form, even though I like it, because different compilers tend to have conflicting warnings like `-Wmissing-field-initializers` `-Wmissing-braces` and others. I think they don't apply in this case, though.
================
Comment at: llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp:203
+ uint32_t Sum = 0;
+ for (uint32_t &B : BucketStarts) {
+ uint32_t Size = B;
----------------
aganea wrote:
> This also could be:
> ```
> for (unsigned I = 1; I < IPHR_HASH; ++I)
> BucketStarts[I] += BucketStarts[I-1];
> BucketStarts[0] = 0;
> ```
I don't think that's quite the same. Suppose we have three buckets, all size 1. Your loop will produce: `0, 2, 3`, but I think the desired start offsets are `0, 1, 2`.
I think the fancy way to do this is `std::exclusive_scan`, but it's new in C++17:
https://en.cppreference.com/w/cpp/algorithm/exclusive_scan
================
Comment at: llvm/lib/DebugInfo/PDB/Native/GSIStreamBuilder.cpp:213
+ HashRecords.resize(Records.size());
+ uint32_t BucketCursors[IPHR_HASH];
+ memcpy(BucketCursors, BucketStarts, sizeof(BucketCursors));
----------------
aganea wrote:
> Same as above: `uint32_t BucketCursors[IPHR_HASH] = {0};`
The initialization stores would be made dead by the memcpy on the next line, though.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81296/new/
https://reviews.llvm.org/D81296
More information about the llvm-commits
mailing list