[libcxx-commits] [PATCH] D102727: Use precise growth rather than amortized growth in std::string::resize/__resize_default_init.
Louis Dionne via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Tue Aug 10 13:45:34 PDT 2021
ldionne requested changes to this revision.
ldionne added a comment.
This revision now requires changes to proceed.
Thanks a lot for the diligent reply and explanation you added to the patch summary. I still wasn't sure about this so I asked the question on the LWG mailing list, which basically consists of other implementers and experts on the library parts of the Standard -- all folks with a lot of experience and usually great judgement.
I mainly wanted to know whether there was unwritten intent in the spec that `resize` behaves like `push_back` and `append` with respect to geometric growth, and also to know whether other implementers had considered or done the change proposed here in their own implementation. The feedback I got is in line with my initial intuition and the reason why I've been so cautious with this change. Note that technically, this change does conform to the Standard, so I'm not pushing back on that basis. Instead, I'm pushing back based on the following reasons, which is a mix of the feedback I got on the LWG thread and my own opinion:
1. The Standard uses the language "appends" to define `resize` (http://eel.is/c++draft/string.capacity#5.2), which is consistent with how `push_back` is defined. Those should behave the same for consistency, and it can be seen as a lack of clarity that the Standard doesn't currently specify a complexity guarantee for `resize`. From a user perspective, the library is easier to learn and use if `resize`, `push_back`, `append` and all the other ways to append (`emplace_back`, etc.) behave similarly, with the only knob to control the capacity being `.reserve()`.
2. Making this change would also make `basic_string` behave unlike `vector`, which adds to the inconsistency. I think we should be consistent across container types.
3. People do use patterns like StringBuilder and are used to `resize` behaving in amortized O(N). While regressions to O(N^2) may be rare, the library's primary role is to be suitable to a general audience more than being aligned with any single code base's needs. Even though I understand that the memory improvements would probably benefit other code bases as well, I think the possibility for regressions (and especially nasty regressions at that) outweighs that potential benefit.
4. Other surveyed implementations all use exponential growth. Diverging from this in libc++ would create implementation divergence, which is usually (although not always) bad. In this case, I think it may hurt portability.
I understand this isn't the resolution you're looking for and I am thankful for the diligence you've put into explaining the change and measuring it. However, someone has to make a judgement call somewhere, and in this case, I'm going with consistency-in-the-API and conservatism instead of the memory consumption gains this could provide.
I will also follow up with LWG to see whether we want to make it a hard requirement for `resize` to behave consistently with `push_back` and friends, which would provide a portability benefit (users would not have to worry about their code being quadratic on some implementations).
One way forward that was suggested on the LWG thread was to add new methods `resize_exactly` and `reserve_exactly` that would allow specifying the exact capacity of the container. Inside Google, you could potentially change all calls to `resize` into calls to `resize_exactly`, which would have the same effect as this patch. Of course, that has to go through the Committee first, but I think it would have chances of success.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D102727/new/
https://reviews.llvm.org/D102727
More information about the libcxx-commits
mailing list