[LLVMdev] [Patches][RFC] What to do about bitcode streaming.
Duncan P. N. Exon Smith
dexonsmith at apple.com
Thu Dec 18 20:15:02 PST 2014
+llvmdev
> On 2014 Dec 18, at 15:14, Rafael EspĂndola <rafael.espindola at gmail.com> wrote:
>
> Currently we support reading bitcode in 3 ways:
>
> * Read everything upfront.
> * Be lazy about reading the function bodies.
> * Read the bitcode over a streaming interface.
>
> The first two modes are commonly used and well tested. In fact the
> "read everything" mode is implemented by starting lazy and calling
> materializeAllPermanently.
>
> The thirds mode is used in tree only by llvm-dis, which means that
> some bugs can hide for quite some time (222505 for example). To the
> best of my knowledge it is only real user is PNaCl.
>
> It also has a disproportional code complexity. There are two virtual
> interfaces (DataStreamer and MemoryObject) and they are leaky: Not all
> bitcode features are supported when streaming and there are a few "if
> (LazyStreamer)" int the reader.
>
> I have attached two patches that solve the problem with different trade-offs.
>
> One just deletes the feature. This would make PNaCl responsible for
> maintaining their own reader. I assume they will have to do it at some
> point since they are looked to a fixed format, but this would make it
> an immediate issue.
>
> The other one gets most of the simplifications by adding just one
> assumption: the size is known. This should be true for http with
> Content-Length.
>
> We go from 2 interfaces to 1 and that interface has a single virtual
> method. There are no ifs on the data being streamed or missing
> features.
>
> It might be even possible to drop the requirement for the size to be
> known: Replace the call to AtEndOfStream by just trying to read more
> and checking if it failed, but that is a bit more than I wanted to do
> for this.
>
> Cheers,
> Rafael
I CC'ed llvmdev to put a few more eyes on the "what's the right
direction?" question.
IMO these both look like huge improvements. The streaming interface was
the strangest part of the bitcode reader when I was trying to figure out
how it all fit together for the use-list-order work.
Personally I favour the "just delete it" path, but maybe there's
something I'm missing, and the other patch looks great too.
I didn't read carefully yet (although I noticed two quirks in the first
patch that I've pointed out below) -- I'll have a closer look once
you've decided on a direction.
> diff --git a/lib/Bitcode/Reader/BitcodeReader.cpp b/lib/Bitcode/Reader/BitcodeReader.cpp
> index e228b1d..1322edd 100644
> --- a/lib/Bitcode/Reader/BitcodeReader.cpp
> +++ b/lib/Bitcode/Reader/BitcodeReader.cpp
> @@ -33,6 +32,26 @@ enum {
> SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex
> };
>
> +BitcodeReader::BitcodeReader(MemoryBuffer * Buffer, LLVMContext & C)
clang-format?
> + : Context(C), TheModule(nullptr), Buffer(Buffer),
> + SeenValueSymbolTable(false), ValueList(C), MDValueList(C),
> + SeenFirstFunctionBody(false), UseRelativeIDs(false),
> + WillMaterializeAllForwardRefs(false) {
> diff --git a/tools/llvm-dis/llvm-dis.cpp b/tools/llvm-dis/llvm-dis.cpp
> index 072f636..3e21164 100644
> --- a/tools/llvm-dis/llvm-dis.cpp
> +++ b/tools/llvm-dis/llvm-dis.cpp
> @@ -135,12 +205,13 @@ int main(int argc, char **argv) {
> else
> DisplayFilename = InputFilename;
> ErrorOr<std::unique_ptr<Module>> MOrErr =
> - getStreamedBitcodeModule(DisplayFilename, Streamer, Context);
> + getStreamedBitcodeModule(DisplayFilename, std::move(Streamer), Context);
> if (std::error_code EC = MOrErr.getError())
> ErrorMessage = EC.message();
> else
> M = std::move(*MOrErr);
> if(M.get()) {
> + errs() << "foobar " << Foo.getMaxPos() << '\n';
Debug output?
> if (std::error_code EC = M->materializeAllPermanently()) {
> ErrorMessage = EC.message();
> M.reset();
More information about the llvm-dev
mailing list