[PATCH] D115374: [NFC][clang] Return std::strings built from raw_string_ostreams more efficiently

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 8 17:36:26 PST 2021


dexonsmith added a comment.

In D115374#3181491 <https://reviews.llvm.org/D115374#3181491>, @logan-5 wrote:

> In D115374#3181383 <https://reviews.llvm.org/D115374#3181383>, @dexonsmith wrote:
>
>> I don't feel strongly, but IMO the code might be a bit harder to read/maintain with the explicit flush. I worry that it'd be easy to move the `flush()` away from the `return`. Not sure I'm right; could just be familiarity with `str()`.
>
> I definitely hear you. I don't really mind it personally, and I did it this way because I saw precedent in a couple spots (there's one on CompilerInvocation.cpp:653, several in clangd, etc.). I definitely see how it could be a little bit spooky though.

I haven't done the git-blame, but I somewhat suspect `str()` was added specifically as a convenience to make the flush+access/return a one-liner.

> I suppose the question then becomes whether to name it `take()` or `str() &&` for symmetry with C++20's `std::ostringstream`. (Also for the record, I agree that NRVOing some std::strings isn't going to make a giant difference; my opinion is simply that if we can get it, we might as well.)

I think `take()` would be clearer for use in LLVM code, despite the symmetry. Especially since the two stream libraries have significant differences. (Also, maybe I'm mis-remembering, but I think I've seen places where `OS.str()` is passed to some function when it's partially written; seems dangerous to silent start selecting a different overload. Could be I'm wrong though.)



================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h:36
+    OS.flush();
+    return Str;
   }
----------------
logan-5 wrote:
> dexonsmith wrote:
> > logan-5 wrote:
> > > Quuxplusone wrote:
> > > > FWIW, it appears to me that in most (all?) of these cases, what's really wanted is not "a string //and// a stream" but rather "a stream that owns a string" (`std::ostringstream` or the LLVM-codebase equivalent thereof). Then the return can be `return std::move(OS).str();` — for `std::ostringstream`, this Does The Right Thing since C++20, and if LLVM had its own stringstream it could make it Do The Right Thing today.
> > > > https://en.cppreference.com/w/cpp/io/basic_ostringstream/str
> > > > 
> > > True enough. Although `return std::move(OS).str();` is still much harder to type than the less efficient `return OS.str();`, and it requires at minimum a move of the underlying string--whereas `return Str;` is the easiest of all to type, and it opens things up for NRVO. If (as I said in the patch summary) `raw_string_ostream` were changed to be guaranteed to not need flushing, `return Str;` would IMHO be cemented as the clear winner.
> > > 
> > > That said, you're clearly right that all these cases are semantically trying to do "a stream that owns a string", and it's clunky to execute with the existing APIs.
> > >  If (as I said in the patch summary) raw_string_ostream were changed to be guaranteed to not need flushing
> > 
> > This sounds like a one-line patch; might be better to just do it rather than having to churn all these things twice.
> > >  If (as I said in the patch summary) raw_string_ostream were changed to be guaranteed to not need flushing
> > 
> > This sounds like a one-line patch; might be better to just do it rather than having to churn all these things twice.
> 
> I guess this change kind of freaks me out. Currently you can call `SetBuffered()` on `raw_string_ostream` (though I don't know why you would...), which creates an intermediate buffer and then `flush()` syncs the buffer with the underlying `std::string&`. Removing that ability would be a breaking change, and I'm not sure how we could make it while being confident we're not breaking anything downstream. 
> 
> (On the other hand, you can call `SetBuffered()` on `raw_svector_ostream` too, whose documentation more or less says it doesn't support buffering. If I'm reading right, you get an assert failure in `~raw_ostream()` if you do.)
I don't think we need to be afraid. `raw_ostream` seems like a bit of a leaky abstraction, but IMO we need to trust that code doesn't arbitrarily call `SetBuffered()` on without full control. It's essentially never the right thing to do except from subclasses (where it could be `protected`) and maybe in unit tests for convenience. As you pointed, out this would also break `svector_stream`.

What I'd be more worried about is the micro-performance-penalty of std::string null-terminating with every write. But probably that's not particularly important either.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115374/new/

https://reviews.llvm.org/D115374



More information about the cfe-commits mailing list