[PATCH] D41102: Setup clang-doc frontend framework

Julie Hockett via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 26 17:07:56 PST 2018


juliehockett added a comment.

In https://reviews.llvm.org/D41102#1017918, @lebedev.ri wrote:

> Is there some (internal to `BitstreamWriter`) logic that would 'assert()' if trying to output some recordid
>  which is, according to the `BLOCKINFO_BLOCK`, should not be there?
>  E.g. outputting `VERSION` in `BI_COMMENT_BLOCK_ID`?


Yes -- it will fail an assertion:
`Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' failed.`



================
Comment at: clang-doc/BitcodeWriter.cpp:191
+  Record.clear();
+  for (const char C : BlockIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);
----------------
lebedev.ri wrote:
> Why do we have this indirection?
> Is there a need to first to (unefficiently?) copy to `Record`, and then emit from there?
> Wouldn't this work just as well?
> ```
> Record.clear();
> Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]);
> ```
No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be manipulated into an `std::string` or a `const char*`, but the `Stream` wants an `unsigned char`. So, the copying is to satisfy that. Unless there's a better way to convert a `StringRef` into an array of `unsigned char`?


================
Comment at: clang-doc/BitcodeWriter.cpp:196
+/// \brief Emits a record name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
----------------
lebedev.ri wrote:
> Hmm, so i've been staring at this and http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say i'm not fond of this indirection.
> 
> What i don't understand is, in previous function, we don't store `BlockId`, why do we want to store `RecordId`?
> Aren't they both unstable, and are implementation detail?
> Do we want to store it (`RecordId`)? If yes, please explain it as a new comment in code.
> 
> If no, i guess this would work too?
> ```
> assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
> Record.clear();
> Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, RecordIdNameMap[ID].Name);
> ```
> And after that you can lower the default size of `SmallVector<> Record` down to, hm, `4`?
I'm not entirely certain what you mean -- in `emitBlockId()`, we are storing both the block id and the block name in separate records (`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In `emitRecordId()`, we're doing something slightly different, in that we emit one record with both the record id and the record name (in record `BLOCKINFO_CODE_SETRECORDNAME`). 

Replacing the copy loop here has the same issue as above, namely that there isn't an easy way to convert between a `StringRef` and an array of `unsigned char`.


================
Comment at: clang-doc/BitcodeWriter.cpp:240
+  if (!prepRecordData(ID, Val)) return;
+  assert(Val < (1U << BitCodeConstants::IntSize));
+  Record.push_back(Val);
----------------
lebedev.ri wrote:
> Ok, now that i think about it, it can't be that easy.
> Maybe
> ```
> FIXME: assumes 8 bits per byte
> assert(llvm::APInt(8U*sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= BitCodeConstants::IntSize));
> ```
> Not sure whether `getBitWidth()` is really the right function to ask though.
> (Not sure how this all works for negative numbers)
That assertion fails :/ I could do something like `static_cast<int64_t>(Val) == Val` but that would require a) IntSize being a power of 2 b) updating the assert anytime IntSize is updated, and 3) still throws a warning about comparing a signed to an unsigned int...


================
Comment at: clang-doc/BitcodeWriter.h:53
+  BI_LAST = BI_COMMENT_BLOCK_ID
+};
+
----------------
lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > So what *exactly* does `BitCodeConstants::SubblockIDSize` mean?
> > > ```
> > > static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too many block id's!");
> > > ```
> > > ?
> > It's the current abbrev id width for the block (described [[ https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here ]]), so it's the max id width for the block's abbrevs.
> So in other words that `static_assert()` is doing the right thing?
> Add it after the `enum BlockId{}` then please, will both document things, and ensure that things remain in a sane state.
No...it's the (max) number of the abbrevs relevant to the block itself, which is to say some subset of the RecordIds for any given block (e.g. for a `BI_COMMENT_BLOCK`,  the number of abbrevs would be 12 and so on the abbrev width would be 4). 

To assert for it we could put block start/end markers on the RecordIds and then use that to calculate the bitwidth, if you think the assertion should be there.


https://reviews.llvm.org/D41102





More information about the cfe-commits mailing list