[PATCH] D120781: [Bitcode] materialize Functions early when BlockAddress taken
Duncan P. N. Exon Smith via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 7 16:28:47 PDT 2022
dexonsmith added a comment.
In D120781#3437465 <https://reviews.llvm.org/D120781#3437465>, @nickdesaulniers wrote:
> there's some tests failing where the "use list" (???) seems to be incorrect
I suspect these are "use-list order" tests, which confirm that we can round-trip to bitcode without changing the use-list order.
Quick summary:
- Every `Value` has an intrusive linked list of its `Use`s -- i.e., where it's used as an operand.
- Some algorithms walk a `Value`'s use-list, and their output can be influenced by the order that the `Use`s are listed.
- This order is deterministic, but arbitrary somewhat arbitrary.
- Bitcode serialization has an optional feature to serialize / recover this use-list order. This is important for reproducing some (rare) bugs, or and for guaranteeing that serializing/deserializing has no side effects.
- To minimize overhead, the algorithm only stores diffs vs. the "natural" use-lists you'd get by running `materializeAll()`. (Side note: this feature is optional because the storage overhead is a bit too high. If we could guarantee that no one looked at use-lists of pure constants like `i1 0`, `ptr null`, and `i32 0`, it would likely be cheap enough to store all the time.)
(Incidentally, that feature is the only reason I know anything about `blockaddress`...)
Anyway, if you stop calling `materialize()` from inside `parseFunctionBody()`, you'll stop changing the use-list order, and these tests should stop bothering you.
> some blockaddresses are being deserialized with the basic block name changed to a number from a string that I don't yet understand.
I'm guessing it's not safe to call `materialize()` from `parseFunctionBody()`; it's maybe not re-entrant, and something is getting corrupted?
Have a look at how blockaddress forward references are handled; if you follow a similar path for these discovered back-edges the corruption might go away.
================
Comment at: llvm/lib/Bitcode/Reader/BitcodeReader.cpp:4324-4343
+ // When we have the RARE case of a BlockAddress Constant that is not
+ // scoped to the Function it refers to, we need to conservatively
+ // materialize the referred to Function, regardless of whether or not
+ // that Function will ultimately be linked, otherwise users of
+ // BitcodeReader might start splicing out Function bodies such that we
+ // might no longer be able to materialize the BlockAddress since the
+ // BasicBlock (and entire body of the Function) the BlockAddress refers
----------------
I think it'd be better to save these in a list to iterate through after finishing materializing whatever's already "scheduled".
I think the right place to run the code is probably in `materializeForwardReferencedFunctions()`, which is already doing stuff for block addresses (in the other direction!). This is called from two places:
- `BitcodeModule::getModuleImpl()`, which only calls it if lazy-loading -- i.e., if NOT calling `materializeAll()`. I am guessing that lazy-loading needs this because block addresses can be referenced by something that isn't lazily loaded (not sure what that is; maybe globals or some metadata?).
- `BitcodeReader::materialize()`, at the end.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120781/new/
https://reviews.llvm.org/D120781
More information about the llvm-commits
mailing list