[libcxx-commits] [PATCH] D148693: [libc++] Set correct size at the end of growing std::string

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Fri Apr 21 02:04:46 PDT 2023


philnik added a comment.

In D148693#4286435 <https://reviews.llvm.org/D148693#4286435>, @AdvenamTacet wrote:

>> Is there a test that can catch this issue?
>
> @ldionne `__grow_by` is private and every public function using it increases the size and eventually sets a new value. So I'm not sure how to create a test for it.
>
> I realized that it's incorrect when I tried to use `.size()` value to annotate container and it failed.
>
>> This changes the postconditions of a function in the dylib, which is an ABI break.
>
> @philnik  I would argue that it's a bug and should be fixed.
>
> https://github.com/llvm/llvm-project/blob/75196f8e72be3f18c5a831e23f385c4ae3eb62b5/libcxx/include/string#L703-L720
>
> Let's look at layout (ASL as example, but the same reasoning works with the second layout):
>
>   struct __long
>   {
>       pointer   __data_;
>       size_type __size_;
>       size_type __cap_ : sizeof(size_type) * CHAR_BIT - 1;
>       size_type __is_long_ : 1;
>   };
>   
>   struct __short
>   {
>       value_type __data_[__min_cap];
>       unsigned char __padding_[sizeof(value_type) - 1];
>       unsigned char __size_ : 7;
>       unsigned char __is_long_ : 1;
>   };
>
> If string is short, its size is in its last byte, rest of the object is its content.
> Then `__grow_by` is called and string is going to became long. The function sets data pointer and capacity (bytes at the beginning and the end, but middle bytes (size) are unchanged).
>
>   __set_long_pointer(__p);
>   __set_long_cap(__allocation.count);
>
> It means that size is set to "random value" (old content) as previously it was strings content (eg. it could be `0x4974277341537472` - `"It'sAStr" - and it will be its size now).
>
> I see no value in that kind of behavior and I think it may be changed.
>
> ---
>
> But let me know if it's //desired/cannot be changed//, I will modify D132769 <https://reviews.llvm.org/D132769> so `__grow_by` unpoisons memory for future elements already, then a call to `.size()` is not necessary.

I don't disagree that it's a bug, but if we started to rely on `__grow_by` setting the size anywhere (or the compiler removes redundant stores), it would result in broken code and unhappy users. Maybe we could make a new function that calls `__grow_by` which has the correct postconditions? We could then remove `__grow_by` from ABIv2 and add the new function. That way we have the better interface without and ABI break.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148693/new/

https://reviews.llvm.org/D148693



More information about the libcxx-commits mailing list