[PATCH] D134055: [clang-doc] Export enum type and value information.

Paul Kirth via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Sep 16 15:34:52 PDT 2022


paulkirth added a comment.

I think maybe you made the title the first line of the summary instead of the other way around.  I was looking for this as the title: `[clang-doc] Add support for explicitly typed enums`



================
Comment at: clang-tools-extra/clang-doc/BitcodeReader.cpp:71
+
+  llvm::SmallVector<uint64_t, 1024> AsWords;
+  AsWords.resize(WordWidth);
----------------
You can avoid the resize w/ SmallVector(Size) constructor, right?


================
Comment at: clang-tools-extra/clang-doc/BitcodeReader.cpp:53
 
+llvm::Error decodeRecord(const Record &R, llvm::APSInt &Field, llvm::StringRef Blob) {
+  auto ByteWidth = R[0];
----------------
brettw wrote:
> paulkirth wrote:
> > do you need to do all of this? APSInt already supports to/from string methods, as well as converting to/from integers. can you use that here and in the writer to avoid some complexity?
> I don't think converting to an integer is a good idea because people sometimes use min/max values for enum values and since this could be signed or unsigned, it gets kind of complicated.
> 
> Serializing a number as a string to bitcode also seemed wrong to me.
> 
> The simplest thing would be to store the value as a string in the EnumValueInfo and then always treat this as a string from then on. If you want it simplified, I think that's the thing to do. But I thought you would want the numeric value stored in clang-doc's "ast" because some backends may want to treat this as a number, check its signed/unsignedness, etc. I happy to change this if you want.
Those are fair points, and I think I misread/misunderstood a bit of what's going on here.

As for encoding/decoding integers,  BitcodeWriter already has integer support, so if you need to convert to/from those, it should already work, right?

Regardless, you may want to look at the bitcode reader/writer in llvm to see how they serialize/deserialize APInt, as their implementation seems a bit more straightforward IMO.

https://github.com/llvm/llvm-project/blob/9050a59c6678789831b7286b8b68d18b966c4694/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L1636
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L2520
https://github.com/llvm/llvm-project/blob/9050a59c6678789831b7286b8b68d18b966c4694/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L2843
https://github.com/llvm/llvm-project/blob/9050a59c6678789831b7286b8b68d18b966c4694/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L2743


================
Comment at: clang-tools-extra/clang-doc/Representation.h:425
+  // constant. This will be empty for implicit enumeration values.
+  std::string ValueExpr;
+};
----------------
Sorry to nitpick, but SmallString was the correct choice to use in this type. We avoid its use as a return value, because it tends to be brittle, and stops us from assigning into arbitrary sized SmallStrings.  In the struct, it's a reasonable choice, especially if you expect most uses to be small.  for these expressions, I expect most to either be the number itself, or some shift operation, so SmallString<16> was probably more than sufficient.


================
Comment at: clang-tools-extra/clang-doc/Serialize.cpp:316
+  for (const EnumConstantDecl *E : D->enumerators()) {
+    SmallString<128> ValueExpr;
+    if (const Expr* InitExpr = E->getInitExpr())
----------------
brettw wrote:
> paulkirth wrote:
> > why was 128 chosen? Aren't we storing it into a `SmallString<16>` in `EnumValueInfo`? is there some external reason that we expect this to be the right size? 
> > 
> > Do you have an idea for how long these are likely to be? if we expect them to be large or completely unpredictable, it may make more sense to use a `std::string` and avoid wasting stack space.
> > 
> >  
> I changed this and the EnumValueInfo struct to std::string. I think the usage of SmallString in these records is over the top but I was trying to copy the existing style.
small buffer optimizations significantly improve performance in the compiler and related tools. switching to a default string implementation that can use those optimizations reduced the total number of heap allocations in a normal run of clang by almost half. 

That's why LLVM's programmer's manual and contribution guidelines suggest using those whenever possible. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134055/new/

https://reviews.llvm.org/D134055



More information about the cfe-commits mailing list