[libcxx-commits] [PATCH] D102727: Use precise growth rather than amortized growth in std::string::resize/__resize_default_init.

Evan Brown via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Mon Jul 19 15:06:11 PDT 2021


ezbr added a comment.

In D102727#2880396 <https://reviews.llvm.org/D102727#2880396>, @ldionne wrote:

> I like the way this change is done better now, but I'm still not sure we want/can do this -- I have a few questions I need the answer to before this LGTM. Sorry if that sounds like I'm trying to poke holes in your optimization, but doing my due diligence requires me to try for a type as important as `std::string` and a method as widely used as `resize()`. We really don't want to get this wrong.
>
> 1. Does this change how iterators are invalidated?

Yes, I think this change would potentially result in iterators being invalidated more frequently. `__invalidate_all_iterators()` is called when we reallocate the buffer in `__grow_by` here <https://github.com/llvm/llvm-project/blob/2ff5a56e1ab2a95c36d3c5d2bef7c585125718ae/libcxx/include/string#L2289> and in `__shrink_or_extend` here <https://github.com/llvm/llvm-project/blob/2ff5a56e1ab2a95c36d3c5d2bef7c585125718ae/libcxx/include/string#L3383>. IIUC, if a user repeatedly uses `resize` to grow a string by a small constant size increment (e.g. `s.resize(s.size()+1)`) from size 0 to size N, we will get O(N) buffer reallocations in the new code whereas in the current code, we have O(log(N)) buffer reallocations.

> 2. http://eel.is/c++draft/string.capacity#5.2 says: "If `n > size()`, appends `n - size()` copies of `c`". Now, `std::string::append` has the strong exception guarantee when an exception is thrown, which means that `std::string::resize` also does when `n > size()`. Does your change modify that (please explain)?

>From https://en.cppreference.com/w/cpp/string/basic_string/resize, `resize()` also has the strong exception guarantee since C++11 so we can't violate that here. IIUC, we don't because the allocation here <https://github.com/llvm/llvm-project/blob/2ff5a56e1ab2a95c36d3c5d2bef7c585125718ae/libcxx/include/string#L3348> is the only new place that can throw (we always follow that branch of `__shrink_or_extend`), and the modification calls (i.e. `__set_long_cap`, etc.) are all after that at the end of the function.

> 3. Can you explain what are the cases where this change introduced quadratic CPU usage? I assume this is whenever someone was calling `resize()` in a loop with a gradually larger size: previously, the code would have reserved exponentially so we would have reallocated very rarely, but with your change, we will reallocate every time.

Yes, that's right. Added to the change description.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102727/new/

https://reviews.llvm.org/D102727



More information about the libcxx-commits mailing list