[llvm] d81b014 - [NFC][Bitstream] Improve the dumpability of bitstream/bitcode headers

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 5 15:10:58 PDT 2022


Author: William Woodruff
Date: 2022-04-05T15:10:49-07:00
New Revision: d81b014469a5c96c0fd5b2e441472f26cfa8a53b

URL: https://github.com/llvm/llvm-project/commit/d81b014469a5c96c0fd5b2e441472f26cfa8a53b
DIFF: https://github.com/llvm/llvm-project/commit/d81b014469a5c96c0fd5b2e441472f26cfa8a53b.diff

LOG: [NFC][Bitstream] Improve the dumpability of bitstream/bitcode headers

The `LLVMBitCodes.h` header contains various enums that are updated whenever LLVM's bitcode fundamentally changes. It would be nice to track these changes in a semi-automated way, so that external tools that attempt to parse LLVM's bitstream and bitcode can remain in sync.

Before this change, `LLVMBitCodes.h` had a single dependency -- it needed the `FIRST_APPLICATION_BLOCKID` enum value from `BitCodes.h`. `BitCodes.h`, in turn, had a whole tree of include dependencies that boiled down to `llvm-config.h`, meaning that it was impossible to dump the AST of either file without having a partial or full LLVM build tree already present.

To eliminate that requirement, this patch introduces a new leaf-only header, `BitCodeEnums.h`, which includes the "core" enums originally in `BitCodes.h`. `LLVMBitCodes.h` and `BitCodes.h` both include this new header in turn, preserving the current header relationships while allowing `LLVMBitCodes.h` to be dumped fully independently with a command like this (run from the repository root):

```
clang -fsyntax-only -x c++ -Illvm/include -Xclang -ast-dump=json -Xclang -ast-dump-filter -Xclang llvm::bitc::BlockIDs llvm/include/llvm/Bitcode/LLVMBitCodes.h
```
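As a rough illustration of the kind of out-of-tree consumer this is meant to help, here is a minimal sketch. It is not part of the patch. The `mirror` namespace and the two helper functions are hypothetical names, and the enumerator values are copied by hand from `BitCodeEnums.h` as added below. A real tool would presumably regenerate its copy from the AST dump rather than hard-code it.

```cpp
// Hypothetical external tool code; nothing here is LLVM API.
// It compiles without any LLVM headers or build tree.
#include <cassert>

namespace mirror { // assumed name for the tool's own copy of the enums
// Values copied from llvm::bitc::StandardBlockIDs in BitCodeEnums.h.
enum StandardBlockIDs {
  BLOCKINFO_BLOCK_ID = 0,
  FIRST_APPLICATION_BLOCKID = 8
};

// Values copied from llvm::bitc::FixedAbbrevIDs in BitCodeEnums.h.
enum FixedAbbrevIDs {
  END_BLOCK = 0,
  ENTER_SUBBLOCK = 1,
  DEFINE_ABBREV = 2,
  UNABBREV_RECORD = 3,
  FIRST_APPLICATION_ABBREV = 4
};
} // namespace mirror

// Block IDs 1-7 are reserved for future expansion (per the header comment).
constexpr bool isReservedBlockID(unsigned ID) {
  return ID > static_cast<unsigned>(mirror::BLOCKINFO_BLOCK_ID) &&
         ID < static_cast<unsigned>(mirror::FIRST_APPLICATION_BLOCKID);
}

// Application-defined blocks start at FIRST_APPLICATION_BLOCKID.
constexpr bool isApplicationBlockID(unsigned ID) {
  return ID >= static_cast<unsigned>(mirror::FIRST_APPLICATION_BLOCKID);
}

// Compile-time checks that would fail if the mirrored values drifted from
// the ones this sketch assumes.
static_assert(mirror::END_BLOCK == 0, "END_BLOCK must be zero");
static_assert(mirror::FIRST_APPLICATION_ABBREV == 4, "unexpected abbrev ID");
static_assert(isReservedBlockID(7) && !isReservedBlockID(8),
              "reserved range is 1-7");
```

Because the sketch has no `main`, it can be checked with `clang -fsyntax-only -x c++` in the same way as the header itself.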

I recognize that this is a pretty unusual change and perhaps not a guarantee that the LLVM authors would like to make in the general case (i.e., that individual files within LLVM can have their AST dumped with minimal dependencies). However, I believe the criticality/limited scope of the file(s) in this patch warrants an exception. Please let me know if there's any other information I can provide, or anything else I can do to improve this patch!

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D108438

Added: 
    llvm/include/llvm/Bitstream/BitCodeEnums.h

Modified: 
    llvm/include/llvm/Bitcode/LLVMBitCodes.h
    llvm/include/llvm/Bitstream/BitCodes.h

Removed: 
    


################################################################################
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 1f6c13ef457b5..2abb306f89d55 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -17,7 +17,10 @@
 #ifndef LLVM_BITCODE_LLVMBITCODES_H
 #define LLVM_BITCODE_LLVMBITCODES_H
 
-#include "llvm/Bitstream/BitCodes.h"
+// This is the only file included, and it, in turn, is a leaf header.
+// This allows external tools to dump the AST of this file and analyze it for
+// changes without needing to fully or partially build LLVM itself.
+#include "llvm/Bitstream/BitCodeEnums.h"
 
 namespace llvm {
 namespace bitc {

diff --git a/llvm/include/llvm/Bitstream/BitCodeEnums.h b/llvm/include/llvm/Bitstream/BitCodeEnums.h
new file mode 100644
index 0000000000000..4288bd3987ae5
--- /dev/null
+++ b/llvm/include/llvm/Bitstream/BitCodeEnums.h
@@ -0,0 +1,90 @@
+//===- BitCodeEnums.h - Core enums for the bitstream format -----*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This header defines "core" bitstream enum values.
+// It has been separated from the other header that defines bitstream enum
+// values, BitCodes.h, to allow tools to track changes to the various
+// bitstream and bitcode enums without needing to fully or partially build
+// LLVM itself.
+//
+// The enum values defined in this file should be considered permanent.  If
+// new features are added, they should have values added at the end of the
+// respective lists.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_BITSTREAM_BITCODEENUMS_H
+#define LLVM_BITSTREAM_BITCODEENUMS_H
+
+namespace llvm {
+/// Offsets of the 32-bit fields of bitstream wrapper header.
+enum BitstreamWrapperHeader : unsigned {
+  BWH_MagicField = 0 * 4,
+  BWH_VersionField = 1 * 4,
+  BWH_OffsetField = 2 * 4,
+  BWH_SizeField = 3 * 4,
+  BWH_CPUTypeField = 4 * 4,
+  BWH_HeaderSize = 5 * 4
+};
+
+namespace bitc {
+enum StandardWidths {
+  BlockIDWidth = 8,   // We use VBR-8 for block IDs.
+  CodeLenWidth = 4,   // Codelen are VBR-4.
+  BlockSizeWidth = 32 // BlockSize up to 2^32 32-bit words = 16GB per block.
+};
+
+// The standard abbrev namespace always has a way to exit a block, enter a
+// nested block, define abbrevs, and define an unabbreviated record.
+enum FixedAbbrevIDs {
+  END_BLOCK = 0, // Must be zero to guarantee termination for broken bitcode.
+  ENTER_SUBBLOCK = 1,
+
+  /// DEFINE_ABBREV - Defines an abbrev for the current block.  It consists
+  /// of a vbr5 for # operand infos.  Each operand info is emitted with a
+  /// single bit to indicate if it is a literal encoding.  If so, the value is
+  /// emitted with a vbr8.  If not, the encoding is emitted as 3 bits followed
+  /// by the info value as a vbr5 if needed.
+  DEFINE_ABBREV = 2,
+
+  // UNABBREV_RECORDs are emitted with a vbr6 for the record code, followed by
+  // a vbr6 for the # operands, followed by vbr6's for each operand.
+  UNABBREV_RECORD = 3,
+
+  // This is not a code, this is a marker for the first abbrev assignment.
+  FIRST_APPLICATION_ABBREV = 4
+};
+
+/// StandardBlockIDs - All bitcode files can optionally include a BLOCKINFO
+/// block, which contains metadata about other blocks in the file.
+enum StandardBlockIDs {
+  /// BLOCKINFO_BLOCK is used to define metadata about blocks, for example,
+  /// standard abbrevs that should be available to all blocks of a specified
+  /// ID.
+  BLOCKINFO_BLOCK_ID = 0,
+
+  // Block IDs 1-7 are reserved for future expansion.
+  FIRST_APPLICATION_BLOCKID = 8
+};
+
+/// BlockInfoCodes - The blockinfo block contains metadata about user-defined
+/// blocks.
+enum BlockInfoCodes {
+  // DEFINE_ABBREV has magic semantics here, applying to the current SETBID'd
+  // block, instead of the BlockInfo block.
+
+  BLOCKINFO_CODE_SETBID = 1,       // SETBID: [blockid#]
+  BLOCKINFO_CODE_BLOCKNAME = 2,    // BLOCKNAME: [name]
+  BLOCKINFO_CODE_SETRECORDNAME = 3 // BLOCKINFO_CODE_SETRECORDNAME:
+                                   //                             [id, name]
+};
+
+} // namespace bitc
+} // namespace llvm
+
+#endif

diff --git a/llvm/include/llvm/Bitstream/BitCodes.h b/llvm/include/llvm/Bitstream/BitCodes.h
index 036e2547430ba..93888f7d3b335 100644
--- a/llvm/include/llvm/Bitstream/BitCodes.h
+++ b/llvm/include/llvm/Bitstream/BitCodes.h
@@ -19,75 +19,12 @@
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Bitstream/BitCodeEnums.h"
 #include "llvm/Support/DataTypes.h"
 #include "llvm/Support/ErrorHandling.h"
 #include <cassert>
 
 namespace llvm {
-/// Offsets of the 32-bit fields of bitstream wrapper header.
-enum BitstreamWrapperHeader : unsigned {
-  BWH_MagicField   = 0 * 4,
-  BWH_VersionField = 1 * 4,
-  BWH_OffsetField  = 2 * 4,
-  BWH_SizeField    = 3 * 4,
-  BWH_CPUTypeField = 4 * 4,
-  BWH_HeaderSize   = 5 * 4
-};
-
-namespace bitc {
-  enum StandardWidths {
-    BlockIDWidth   = 8,  // We use VBR-8 for block IDs.
-    CodeLenWidth   = 4,  // Codelen are VBR-4.
-    BlockSizeWidth = 32  // BlockSize up to 2^32 32-bit words = 16GB per block.
-  };
-
-  // The standard abbrev namespace always has a way to exit a block, enter a
-  // nested block, define abbrevs, and define an unabbreviated record.
-  enum FixedAbbrevIDs {
-    END_BLOCK = 0,  // Must be zero to guarantee termination for broken bitcode.
-    ENTER_SUBBLOCK = 1,
-
-    /// DEFINE_ABBREV - Defines an abbrev for the current block.  It consists
-    /// of a vbr5 for # operand infos.  Each operand info is emitted with a
-    /// single bit to indicate if it is a literal encoding.  If so, the value is
-    /// emitted with a vbr8.  If not, the encoding is emitted as 3 bits followed
-    /// by the info value as a vbr5 if needed.
-    DEFINE_ABBREV = 2,
-
-    // UNABBREV_RECORDs are emitted with a vbr6 for the record code, followed by
-    // a vbr6 for the # operands, followed by vbr6's for each operand.
-    UNABBREV_RECORD = 3,
-
-    // This is not a code, this is a marker for the first abbrev assignment.
-    FIRST_APPLICATION_ABBREV = 4
-  };
-
-  /// StandardBlockIDs - All bitcode files can optionally include a BLOCKINFO
-  /// block, which contains metadata about other blocks in the file.
-  enum StandardBlockIDs {
-    /// BLOCKINFO_BLOCK is used to define metadata about blocks, for example,
-    /// standard abbrevs that should be available to all blocks of a specified
-    /// ID.
-    BLOCKINFO_BLOCK_ID = 0,
-
-    // Block IDs 1-7 are reserved for future expansion.
-    FIRST_APPLICATION_BLOCKID = 8
-  };
-
-  /// BlockInfoCodes - The blockinfo block contains metadata about user-defined
-  /// blocks.
-  enum BlockInfoCodes {
-    // DEFINE_ABBREV has magic semantics here, applying to the current SETBID'd
-    // block, instead of the BlockInfo block.
-
-    BLOCKINFO_CODE_SETBID        = 1, // SETBID: [blockid#]
-    BLOCKINFO_CODE_BLOCKNAME     = 2, // BLOCKNAME: [name]
-    BLOCKINFO_CODE_SETRECORDNAME = 3  // BLOCKINFO_CODE_SETRECORDNAME:
-                                      //                             [id, name]
-  };
-
-} // End bitc namespace
-
 /// BitCodeAbbrevOp - This describes one or more operands in an abbreviation.
 /// This is actually a union of two different things:
 ///   1. It could be a literal integer value ("the operand is always 17").
@@ -183,6 +120,6 @@ class BitCodeAbbrev {
     OperandList.push_back(OpInfo);
   }
 };
-} // End llvm namespace
+} // namespace llvm
 
 #endif


        

