Introduce StringBuilder utility around raw_string_ostream

David Blaikie dblaikie at gmail.com
Thu Jun 26 16:50:14 PDT 2014


On Thu, Jun 26, 2014 at 4:05 PM, Alp Toker <alp at nuanti.com> wrote:
>
> On 27/06/2014 01:42, David Blaikie wrote:
>>
>> On Thu, Jun 26, 2014 at 3:37 PM, Alp Toker <alp at nuanti.com> wrote:
>>>
>>> On 27/06/2014 00:48, David Blaikie wrote:
>>>>
>>>> On Wed, Jun 25, 2014 at 9:53 PM, David Blaikie <dblaikie at gmail.com>
>>>> wrote:
>>>>>
>>>>> On Wed, Jun 25, 2014 at 5:52 PM, Alp Toker <alp at nuanti.com> wrote:
>>>>>>
>>>>>> On 26/06/2014 03:16, Chandler Carruth wrote:
>>>>>>
>>>>>>> On Thu, Jun 26, 2014 at 2:09 AM, Alp Toker <alp at nuanti.com
>>>>>>> <mailto:alp at nuanti.com>> wrote:
>>>>>>>
>>>>>>>       I've made that change, plus a couple more updated uses, and
>>>>>>> landed
>>>>>>>       this in r211749.
>>>>>
>>>>> FWIW, perhaps I wasn't clear - but my "I'd like to loop back around
>>>>> with Chandler to nail down the design choice" was "I'm not signing off
>>>>> on this patch without Chandler", so the commit may've been a bit
>>>>> premature.
>>>>
>>>> Given this confusion, I think appropriate to back this patch out until
>>>> it's actually been signed off on.
>>>>
>>>> Sorry for my vague language/lack of clarity over incremental review
>>>> compared to sign-off.
>>>
>>>
>>> I don't agree there was any problem with review here. The patch got its
>>> OKs
>>> and looks greats,
>>
>> You asked me if it was OK to land. I gave some feedback and said I'd
>> like to wait. You interpreted that as sign-off. I did not intend it as
>> such. We seem to have failed to communicate and I'm trying to clarify
>> what I intended to communicate.
>>
>> I'm sorry for the miscommunication, but I really didn't mean to give
>> you the impression that this was ready to commit. Please revert it.
>
>
> There's obviously no way to revert this without losing the bug fixes.

Right - they can all be included together in whatever form this is
recommitted. No worries about that. (happens all the time here - given
we prefer fine grained commits (ideally so that sometimes they can be
reverted separately, or at least examined separately when looking
through commit history, etc) so sometimes a set of related patches
have to be reverted together. Not a problem)

> I've backed them out together temporarily in r211814.

Thanks!

> Could you state any concerns you have clearly?

Sure - sorry, just didn't want to confuse code review protocol with
the specific merits or caveats of this patch. Not even a question of
whether I had concerns or not, just that I hadn't actually signed off
on the patch. A simple misunderstanding. I'll try to be more clear in
the future.

> A critical feature of the patch is guaranteed safety. Where you see a
> "string_ostream" type you get an assurance that the string building
> operations are valid.

The difference here would just be "Where you see string_ostream
default constructed, you get an assurance that the string building
operations are valid", right? (it's just "if you use this constructor
that takes an externally-provided buffer, buyer beware" - tihs is true
of many types (eg: std::unique_ptr's raw ctor, compared to something
like make_unique)).

> Those assurances apply even when the builder is passed by reference. You can
> see the clarity this adds for example in LLVMCreateMessage() and a few other
> places on ToT.

The cleanup in LLVMCreateMessage and callers seems almost entirely
orthogonal to your new type. The same API change could be made even
with the old code:

  string s;
  raw_string_ostream stream(s);
  ...
  return LLVMCreateMessage(stream);

LLVMCreateMessage is still safe, clear, and correct using either API
because str(), even on raw_string_ostream as it is today, is still
well defined to flush and then give access to the string.

Or, with a single type, it'd just be:

  raw_string_ostream stream;

instead of

  string_ostream stream;

Or is there some other clarity benefit you were referring to?

(with the same LLVMCreateMessage change you made (though I personally
would have LLVMCreateMessage take a const std::string& - it seems more
appropriate for a variety of callers, even if the current callers all
happen to have raw_string_ostreams at the moment - indeed, perhaps
some of them should just be using string concatenation anyway))

> Importantly, there is no locality of construction so *the type becomes the
> documentation*.

Generally it's only important to know this distinction in the scope
where the type is declared (because it's that scope where the external
storage is provided) - once the stream is passed to some other API,
the only access to the string is likely going to be through
raw_string_ostream's "str()" member, which does the right thing.

> If we were to lump storage into raw_string_ostream, this strong safety
> indicated by the type system would disappear along with all the heap
> savings.

Heap savings can be reacquired where /really/ necessary by using
external storage, if necessary. By building in storage, that would
become the exception rather than the rule, and something we could more
closely scrutinize during code review.

> (which simply aren't negligible compared to some of the effort going into
> bit packing elsewhere in the codebase).

Apples and oranges, I suspect (& wouldn't mind some evidence to the
contrary before we proliferate more choices in data structure type on
the codebase - we have a lot, many for good reason, but there is a
cognitive/code review/etc load we pay by adding more). The bit packing
sort of stuff happens in data structures that there are often
thousands upon thousands of in a single run, or at a single point in
time so they have a significant impact on peak memory usage. I doubt
we have more than a handful of raw_string_ostreams in existence
simultaneously.

> If we mix semantics in the base, we'd basically just end up a raw string
> ostream that sometimes isn't raw,  even making the name of the type a
> little dubious.

(On naming: as Chandler mentioned, these are all raw streams - even
the new ones you're proposing, the "raw"ness is intended to
distinguish them from the C++ Standard stream libraries that have
streambufs and all that other stuff, not about whether they have
internal storage or not)

- David



More information about the llvm-commits mailing list