[PATCH] D63115: [llvm-readobj] Fix output interleaving issue caused by using multiple streams at the same time.

Wed Jun 12 02:10:41 PDT 2019

jhenderson added a comment.

By the way, since you are touching COFF and Mach-O code, your test should probably be run on a file with each of those formats too.

================
Comment at: llvm/test/tools/llvm-readobj/check-output-order.test:7-9
+## TODO: Add a --all-debug option for llvm-readobj to turn on literally every
+##       option, then copy the output of `llvm-readobj --all-debug` and paste
+##       it into the CHECK line here.
----------------
ychen wrote:
> rupprecht wrote:
> > There is an `--all` option that turns on: `--file-headers, --program-headers, --section-headers, --symbols, --relocations, --dynamic-table, --notes, --version-info, --unwind, --section-groups and --elf-hash-histogram`
> > 
> > Is that what you want here?
> > 
> > That said, checking the full content of `--all` in a test may be too verbose. Maybe just check a few lines from each section; enough to make sure they aren't interleaved.
> Totally agree that checking for all output in the test file is too much. Comments updated.
> 
> `--all ` does not contain every possible piece of information. If we really want to check the order is the same as printing order (for all possible output) in the code for robustness, I think `--all-debug` is still needed.
--all can be combined with other options manually, if we are concerned with getting everything. I'm not sure how worth it it is to check every combination, but I certainly won't argue against somebody doing the work if they feel there's value.

When it comes to checking, you can just check the last line or two of the previous block, followed by the first couple of lines of the next block. If they are interleaved, one or other of the checks will fail.

================
Comment at: llvm/tools/llvm-readobj/ELFDumper.cpp:391-392
   GNUStyle(ScopedPrinter &W, ELFDumper<ELFT> *Dumper)
-      : DumpStyle<ELFT>(Dumper), OS(W.getOStream()) {}
+      : DumpStyle<ELFT>(Dumper),
+        OS(static_cast<formatted_raw_ostream&>(W.getOStream())) {}

----------------
ychen wrote:
> rupprecht wrote:
> > ychen wrote:
> > > rupprecht wrote:
> > > > ychen wrote:
> > > > > rupprecht wrote:
> > > > > > I don't understand this part of the change. `W.getOStream()` is a `raw_ostream` and can't be casted to a subclass. What this is doing is strange -- I'm not sure how it compiles.
> > > > > > 
> > > > > > I don't think the change to this file is actually necessary, is it?
> > > > > > can't be casted to a subclass.
> > > > > Why not if we know for a fact that the referenced object is indeed that subclass?
> > > > > 
> > > > > > What this is doing is strange -- I'm not sure how it compiles.
> > > > > I could be wrong but I think it is because there is no compiler checking for static_cast.
> > > > > 
> > > > > >I don't think the change to this file is actually necessary, is it?
> > > > > If not, `GNUStyle<>::OS` is another output stream that has different buffering state than the stream referenced by `W.getOStream()` here. (See formatted_raw_ostream::setStream)
> > > > > If not, GNUStyle<>::OS is another output stream
> > > > It looks like `formatted_raw_ostream` just wraps the stream passed in; it doesn't create another stream
> > > > 
> > > > I don't think what's happening is what appears to be happening. It looks like a cast. But I think it's creating a temporary reference that immediately leaves the stack and causes heavy UB here.
> > > > 
> > > > Have you run unit tests with this part of the change? There are tons of crashes for `check-llvm-tools-llvm-readobj` with just this part of the change.
> > > Ran both` check-all` and `check-llvm-tools-llvm-readobj`, did not see any regression ... (based on master b341d305a4cef8e75f520d104cd09b0fcfdc8bba)
> > > 
> > > > It looks like formatted_raw_ostream just wraps the stream passed in;
> > > Agree. And formatted_raw_ostream is a stream by itself even without wrapping the stream passed in.
> > > 
> > > > it doesn't create another stream
> > > I'm not sure. ` formatted_raw_ostream` is a subclass of `raw_ostream`. Why is it not another stream?
> > > Ran both` check-all` and check-llvm-tools-llvm-readobj, did not see any regression ... (based on master b341d305a4cef8e75f520d104cd09b0fcfdc8bba)
> > I only get test failures when I have just this file changed. When I apply the whole patch, the test failures go away.
> > A coworker pointed me to the rules here: http://eel.is/c++draft/expr.static.cast#2
> > 
> > The tl;dr is that this is valid //if// it's always true that `W.getOStream()` here is always a `formatted_raw_ostream`. If it's not, then you get UB (as evidenced by crashes when running unit tests). The rest of this patch updates all the places so that it is `formatted_raw_ostream` everywhere, hence why you don't see any crashes.
> > 
> > So in that sense, this patch is fine, but IMO it's a very fragile contract that we shouldn't rely on -- there's 4 layers of indirection between where `ScopedPrinter` is created (and now initialized w/ `fouts()` instead of `outs()`). Would it be possible, for instance, to instead change `ScopedPrinter` to take a `formatted_raw_ostream` instead of a `raw_ostream`? Or maybe insert calls to `flush()` at appropriate places (whenever switching between GNU vs LLVM style printing)?
> The contract could be checked by adding `assert(&W.getOStream() == &llvm::fouts())`.
> 
> Relying on flush() and having more than one stream outputting sounds more fragile to me. It needs tracking more locations in the source code. 
> 
> >change `ScopedPrinter` to take a formatted_raw_ostream
> This could serve the purpose here. But `ScopedPrinter` should not care about if the stream is formatted or not. 
The `OS` here should definitely be a reference, not a copy since the buffering for streams is local to the instance, and not attached to the main stream itself that everything wraps.

Is there anything stopping GNUStyle having a `raw_ostream &` and letting virtual function handling do the work of figuring out the appropriate thing to call? That would solve all these issues.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63115/new/

https://reviews.llvm.org/D63115