[PATCH] D117244: [Bitcode] [ThinLTO] Add a new bitcode module record for THINLTO_INDEX_FLAG

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 13 16:10:23 PST 2022


luna added inline comments.


================
Comment at: llvm/tools/llvm-dis/llvm-dis.cpp:194-204
       BitcodeLTOInfo LTOInfo = ExitOnErr(MB.getLTOInfo());
+
+      std::unique_ptr<Module> M;
+      if ((!DumpThinLTOIndexOnly) && (!LTOInfo.OnlyHasThinLTOIndex)) {
+        M = ExitOnErr(
+            MB.getLazyModule(Context, MaterializeMetadata, SetImporting));
+        if (MaterializeMetadata)
----------------
luna wrote:
> (Just a message before fixing the failed test)
> 
> In pre-merge test, `llvm-dis` parsing behavior changed for a bitcode that's invalid itself; the behavior changed from parse error to crash (test expects `llvm-dis` to return parse error while now `llvm-dis` crashes).
> 
> After a quick test, this is caused by the change of read order here;
> - before this patch, `getLazyModule` will exit-on-err so `getLTOInfo` won't be executed
> - after this patch, `getLTOInfo` is executed first, and crashes upon invalid abbrev ID around https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Bitstream/BitstreamReader.h#L528
> 
> I'll see how to solve this without changing test expectation from `not` to `crash` (i.e., `llvm-dis` still reports error rather than crash upon invalid bitcode).
> 
> The relevant checker line in `invalid.test` is `/var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-dis -disable-output /var/lib/buildkite-agent/builds/llvm-project/llvm/test/Bitcode/Inputs/invalid-pointer-element-type.bc `
To add more context on `getLazyModule` and `getLTOInfo` and  regarding how a bitcode is read 
      a) `getLazyModule` will invoke `BitcodeReader::parseModule`, which parses each subblock and exit-on-err when invalid pointer is read [1]; this is also where `invalid record` is returned when non-index-reader is used to parse minimized lto code.

      b) `getLTOInfo` [2] has the equivalent loop to iterate blocks and subblocks, but skips uninteresting subblocks. As a result the invalid block `TYPE_BLOCK_ID_NEW` is read in a skipping fashion (while it's detected if parsed), so the read action right after `TYPE_BLOCK_ID_NEW` is in a sense undefined and triggered `invalid abbrev id` [3] in this case.

This means, `llvm-dis` gives more clear error log by parsing first and tell which section is not correct; if `llvm-dis` tries to read LTO info before parsing, and reading LTO info doesn't parse, the error log is less helpful than error log from parsing.

On the other hand, if a wrong reader is used during parsing, `llvm-dis` gives false alarms (e.g., `llvm-dis` cannot read minimized thinlto bitcode now)

>From this perspective, it seems the right for `llvm-dis` to first parse and detect error; two options at the top of my head to solve this

  -In `getLazyModule`, use the read-and-fail approach (try common reader first, fallback to the other reader, and returns error if the fallback doesn't work); the index-flag in module block is probably not necessary in this option to solve `llvm-dis` issue.
   
  - Add another flavor of `getLTOInfo` that does parsing; but the problem of choosing a right reader still exists.

Maybe I should book a chat to discuss this further, but this patch is not in a hurry and not in the critical path anyway.

btw verified the loop to read module block is the same between `getLTOInfo` and `getLazyModule` implementation by printing names out (`getLazyModule` returns error on the 3rd entry when parsing it, `getLTOInfo` skips 3rd entry and fails on the 4th one).

[1] https://github.com/llvm/llvm-project/blob/d2cc6c2d0c2f8a6e272110416a3fd579ed5a3ac1/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L3657
[2] https://github.com/llvm/llvm-project/blob/d2cc6c2d0c2f8a6e272110416a3fd579ed5a3ac1/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L6982
[3] https://github.com/llvm/llvm-project/blob/6a1f50b84ae8f8a8087fcdbe5f27dae8c76878f1/llvm/include/llvm/Bitstream/BitstreamReader.h#L528





Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117244/new/

https://reviews.llvm.org/D117244



More information about the llvm-commits mailing list