[libcxx-commits] [PATCH] D114395: [libc++] Fix the return value of max_size()

Mon Nov 29 13:37:01 PST 2021

aaronpuchert added a comment.

In D114395#3159050 <https://reviews.llvm.org/D114395#3159050>, @Quuxplusone wrote:

> FWIW, I'm still of the (very firmly held) opinion that this patch is unnecessary. `max_size()` is //always// just an upper bound on the size of the sequence. Lowering that upper bound
>
> - might introduce actual factual errors, e.g. if we get the bound wrong — whereas `size_t(-1)` can never be wrong

Which kinds of errors would you be worried about? The `sizeof` division itself isn't so complicated that I'd be worried about it being wrong.

> - is no closer to being "accurate," because it is just as impossible //in practice// to have a `string_view` of length 9223372036854775807 or 4611686018427387903, as to have a `string_view` of length 18446744073709551615. These are all equally lying numbers, and there's no point pretending that any one of them is "more useful" or "more accurate" than any other.

It's a bit unfortunate though that a `basic_string_view<char16_t>` of size 9223372036854775808 has a memory size of 0 bytes on a 64-bit system. (Or does it? The problem is that unsigned overflow is well-defined.)

Sure, the other numbers are also highly suspicious, but there are different kinds of wrong.

================
Comment at: libcxx/include/string_view:334
     _LIBCPP_CONSTEXPR _LIBCPP_INLINE_VISIBILITY
-    size_type max_size() const _NOEXCEPT { return numeric_limits<size_type>::max(); }
+    size_type max_size() const _NOEXCEPT { return (numeric_limits<size_type>::max() - sizeof(*this)) / sizeof(value_type); }

----------------
ldionne wrote:
> mclow.lists wrote:
> > Quuxplusone wrote:
> > > (1) This seems unnecessarily fiddly; I don't really see anything wrong with the original.
> > > (2) If we're going to do this, let's add a line break somewhere.
> > > (3) Since `sv.size()` cannot physically return anything bigger than `PTRDIFF_MAX`, you probably want to use `ptrdiff_t` instead of `size_type` here... or maybe return `std::min(PTRDIFF_MAX, SIZE_MAX / sizeof(value_type))`? But again, this is needlessly fiddly and I see nothing wrong with just returning `SIZE_MAX` unconditionally. Literally nobody should ever be calling this method.
> > > 
> > > Do any of the other methods (e.g. the constructor) need to do something special to preserve the postcondition that `sv.size() <= sv.max_size()`? Before this change, the postcondition was vacuously true; now, it could conceivably be false, and maybe we need to check for that?
> > My preference is `return (numeric_limits<size_type>::max() / sizeof(value_type)` since this is the max number of elements that a `string_view` can hold.
> > 
> > As for "literally nobody should be calling this method", I agree, with the exception of people writing test suites.
> @Quuxplusone 
> 
> > (1) This seems unnecessarily fiddly; I don't really see anything wrong with the original.
> 
> If the character type is more than 1 byte long, the original is wrong.
> 
> @mclow.lists 
> 
> > My preference is return `(numeric_limits<size_type>::max() / sizeof(value_type)` since this is the max number of elements that a string_view can hold.
> 
> Yeah, I agree now. I was using `- sizeof(*this)` as some sort of poor man's way of removing the current `string_view`'s memory usage from the lot. However, this is entirely misguided -- re-reading the definition of `size_t`:
> 
> > `std::size_t` can store the maximum size of a theoretically possible object of any type (including array).
> 
> It doesn't have anything to do with the amount of memory usable on a system, it's only the maximum size of a single object. So yeah, I agree with your suggested definition.
Isn't the point of `max_size` to return the maximal size so that address or memory consumption computation doesn't wrap around or overflow? So I think this change is correct. Surely `max_size` is more relevant in (owning) containers, but staying below should guarantee that `size() * sizeof(value_type)` doesn't overflow.

> Do any of the other methods (e.g. the constructor) need to do something special to preserve the postcondition that `sv.size() <= sv.max_size()`? Before this change, the postcondition was vacuously true; now, it could conceivably be false, and maybe we need to check for that?

One might add an assertion to the `basic_string_view(const _CharT*, size_type)` constructor, but it'll probably not find anything relevant.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114395/new/

https://reviews.llvm.org/D114395