[PATCH] Refactor bitcode reader to simplify control.

Karl Schimpf kschimpf at google.com
Wed Apr 8 11:34:47 PDT 2015


DId all changes except adding test that forward block address references get resolved on lazy loads.


================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:235
@@ +234,3 @@
+    AtTopLevel,    // Processing top-level records.
+    InsideModule,  // processing records inside a module block.
+    // All states below here represent cases where input shouldn't be parsed.
----------------
jvoung wrote:
> be consistent about capitalization in comments
> 
> one line starts with "parsed input" and another line starts with "Parsed input" =)
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:257
@@ -228,3 +256,3 @@
 
-  explicit BitcodeReader(MemoryBuffer *buffer, LLVMContext &C,
+  explicit BitcodeReader(MemoryBuffer *Buffer, LLVMContext &C,
                          DiagnosticHandlerFunction DiagnosticHandler);
----------------
jvoung wrote:
> Don't need explicit anymore, though that transition was a while back so not really related to this CL.
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:275
@@ -246,2 +274,3 @@
 
-  /// @brief Main interface to parsing a bitcode buffer.
+  /// @brief Starts an incremental parse for module M. Reads enough to
+  /// define global values. The flags define what should happen before
----------------
jvoung wrote:
> http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments
> 
> says "\brief" instead of @brief
Sorry, my fault. I followed the syntax of ConstantPlaceHolder below. Changing to follow coding standards.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:283
@@ +282,3 @@
+  /// @returns true if an error occurred.
+  std::error_code StartParse(Module *M,
+                             bool MaterializeAll,
----------------
jvoung wrote:
> lowercase first letter of function name -- should probably do it for these new functions since you're touching it (but leave existing functions alone?)
Good point. Fixing.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:284
@@ +283,3 @@
+  std::error_code StartParse(Module *M,
+                             bool MaterializeAll,
+                             bool ShouldLazyLoadMetadata = false);
----------------
jvoung wrote:
> Variable name is different from comment "MaterializeAll" vs "ShouldMaterializeAll" -- make them the same? I see that in the actual definition you're trying to avoid conflicting with the field name...
Fixing the names to be consistent. Fixing name conflict by prefixing assignments of field names with "this->".

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:288
@@ -248,2 +287,3 @@
+  /// @brief Parses bitcode. Materializes based on flags.
   /// @returns true if an error occurred.
   std::error_code ParseBitcodeInto(Module *M,
----------------
jvoung wrote:
> \returns
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:397
@@ +396,3 @@
+  // comes first.
+  std::error_code ContinueParse();
+
----------------
jvoung wrote:
> lowercase first letter for new function (and I guess update the commit message if you do)
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:401
@@ +400,3 @@
+  // StartParse().
+  std::error_code FinishParse();
+
----------------
jvoung wrote:
> same
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:409
@@ +408,3 @@
+                                     std::error_code &EC) {
+    ParseState = EC ? ParseError : NewValue;
+    NextUnreadBit = Stream.GetCurrentBitNo();
----------------
jvoung wrote:
> In the review, I've tried to look for where ParseState gets set, and there are various ways to grep for that...
> 
> one is ParseState = X... another is updateParseState(Y, ...), or updateParseState(Z);
> 
> Is there any way you can make the number of variations smaller?
I guess the first part of the problem is that there are 2 notions of parse state:

1) The field ParseState that names the state of the parser, and
2) NextUnreadBit which defines where to continue the parse on return (in case a function body gets parsed between calls).

I also overloaded the return value with this update. Refactoring to do less and be more clear.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:424
@@ +423,3 @@
+
+  std::error_code updateParseState() {
+    NextUnreadBit = Stream.GetCurrentBitNo();
----------------
jvoung wrote:
> This is a bit weird to me... you have a bunch of function called "updateParseState" but some variants modify ParseState and some variants don't.
The refactoring is a bit better now. Hopefully good enough.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:2823
@@ +2822,3 @@
+        // Suspend parsing when we reach a function body, assuming we
+        // have already associated names with global values. NOte: If
+        // the bitcode file is old, the symbol table will be at the
----------------
jvoung wrote:
> NOte -> Note
Done.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:3241
@@ +3240,3 @@
+
+  if (ShouldMaterializeAll) {
+    if (std::error_code EC = MaterializeModule(TheModule))
----------------
jvoung wrote:
> I don't quite understand how "ShouldMaterializeAll = false" is supposed to work for the streaming case, if this isn't checked until after:
> 
> while (ParseState < NoMoreInput) {
>     if (std::error_code EC = ContinueParse()) {
>       return EC;
>     }
>   }
> 
> How do you delay reading until materialize(GV) for streaming?
After talking to Derek, I realized what was the issue I was missing. I'l summarize what I understand:

1) When streaming, we want to "return" as soon as possible, without having to force all bitcode to be scanned. This reduces the cost of (potential) blocking calls to the data streamer.

2) Control can return to the caller without having completed the parse. However, the parsed portions must be consistent (i.e. forward block address references have been resolved).

Based on this, I've modified the code to lift the materializeForwardReferencedFunctions into startParse.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:3245
@@ +3244,3 @@
+  } else {
+    if (std::error_code EC = materializeForwardReferencedFunctions())
+      return EC;
----------------
jvoung wrote:
> This used to be early, in "getLazyBitcodeModuleImpl". This looks like it is now happening late in FinishParse after the loop until NoMoreInput.
> 
> Why is this okay now? Was the early call to "materializeForwardReferencedFunctions" actually extraneous because of the call in materialize(GV), or what?
> 
> Make sure to check the lazy case with blockaddresses for computed gotos, if there isn't already a unittest for that.
I agree that there should be some type of forward reference unit test to verify we can lazy evaluate these forward-referenced block addresses.

I also agree that for the use by llvm-dis, my original code worked because it eventually calls materializeAllPermanently. I think I may have been confused about the full expectation of "streamed" (or lazy) was because of this.

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:4473
@@ -4331,3 +4472,3 @@
   // blockaddresses.
-  return materializeForwardReferencedFunctions();
+  return  materializeForwardReferencedFunctions();
 }
----------------
jvoung wrote:
> no need for extra space
Done.

http://reviews.llvm.org/D8786

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list