[PATCH] D125145: [Bitstream] Only consider flushing to file on block boundaries

Sam McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 6 16:37:58 PDT 2022


sammccall created this revision.
sammccall added reviewers: MaskRay, stephan.yichao.zhao.
Herald added subscribers: StephenFan, usaxena95, kadircet.
Herald added a project: All.
sammccall requested review of this revision.
Herald added subscribers: llvm-commits, ilya-biryukov.
Herald added a project: LLVM.

The goal of flushing to disk is to keep a reasonable bound on peak memory usage.
With a a default threshold of 512MB (and most BitstreamWriters having no backing
file at all), checking after every byte whether to flush seems excessive.

This change makes clangd's unittests run 5% faster (in opt), so it's not
actually free even in the case with no backing file. Likely there are more
important workloads where it makes some difference.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125145

Files:
  llvm/include/llvm/Bitstream/BitstreamWriter.h


Index: llvm/include/llvm/Bitstream/BitstreamWriter.h
===================================================================
--- llvm/include/llvm/Bitstream/BitstreamWriter.h
+++ llvm/include/llvm/Bitstream/BitstreamWriter.h
@@ -74,16 +74,10 @@
   };
   std::vector<BlockInfo> BlockInfoRecords;
 
-  void WriteByte(unsigned char Value) {
-    Out.push_back(Value);
-    FlushToFile();
-  }
-
   void WriteWord(unsigned Value) {
     Value = support::endian::byte_swap<uint32_t, support::little>(Value);
     Out.append(reinterpret_cast<const char *>(&Value),
                reinterpret_cast<const char *>(&Value + 1));
-    FlushToFile();
   }
 
   uint64_t GetNumOfFlushedBytes() const { return FS ? FS->tell() : 0; }
@@ -114,7 +108,7 @@
   /// null, \p O does not flush incrementially, but writes to disk at the end.
   ///
   /// \p FlushThreshold is the threshold (unit M) to flush \p O if \p FS is
-  /// valid.
+  /// valid. Flushing only occurs at (sub)block boundaries.
   BitstreamWriter(SmallVectorImpl<char> &O, raw_fd_stream *FS = nullptr,
                   uint32_t FlushThreshold = 512)
       : Out(O), FS(FS), FlushThreshold(FlushThreshold << 20), CurBit(0),
@@ -249,8 +243,8 @@
 
     // Emit the bits with VBR encoding, NumBits-1 bits at a time.
     while (Val >= Threshold) {
-      Emit(((uint32_t)Val & ((1 << (NumBits-1))-1)) |
-           (1 << (NumBits-1)), NumBits);
+      Emit(((uint32_t)Val & ((1 << (NumBits - 1)) - 1)) | (1 << (NumBits - 1)),
+           NumBits);
       Val >>= NumBits-1;
     }
 
@@ -327,6 +321,7 @@
     CurCodeSize = B.PrevCodeSize;
     CurAbbrevs = std::move(B.PrevAbbrevs);
     BlockScope.pop_back();
+    FlushToFile();
   }
 
   //===--------------------------------------------------------------------===//
@@ -472,14 +467,12 @@
     FlushToWord();
 
     // Emit literal bytes.
-    for (const auto &B : Bytes) {
-      assert(isUInt<8>(B) && "Value too large to emit as byte");
-      WriteByte((unsigned char)B);
-    }
+    assert(llvm::all_of(Bytes, [](UIntTy B) { return isUInt<8>(B); }));
+    Out.append(Bytes.begin(), Bytes.end());
 
     // Align end to 32-bits.
     while (GetBufferOffset() & 3)
-      WriteByte(0);
+      Out.push_back(0);
   }
   void emitBlob(StringRef Bytes, bool ShouldEmitSize = true) {
     emitBlob(makeArrayRef((const uint8_t *)Bytes.data(), Bytes.size()),


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D125145.427789.patch
Type: text/x-patch
Size: 2354 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220506/a9133db7/attachment.bin>


More information about the llvm-commits mailing list