[cfe-dev] libc++: max_size() of a std::vector

Sun Feb 15 14:19:55 PST 2015

Hi Csaba,

I know that the type of the objet returned by the .size() method is std::size_t which is an unsigned integer of n bits (where n = 32 on x86 and n = 64 on x86-64). My claim is that libstdc++ (and every standard library that I know of) runs into undefined behaviour for vectors of size >= 2^(n-1) when the size method is called.

To make things concrete, let’s suppose that:
- You are on a 32 bits system (n = 32)
- You construct a std::vector<char> of size 2^31, which represents 2GB of memory. We assume that this memory is available so the constructor does not throw any exception
- You call the .size() method on that object

Here is the code available in the libstdc++ that comes with gcc 4.9.2

      size_type
      size() const _GLIBCXX_NOEXCEPT
      { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }

As you can see, it takes the difference of the pointers this->_M_impl._M_finish and this->_M_impl._M_start. The second one points to the first element of the array and the first one points to the one after the last element of the array. Now, if you look at the C++11 standard, 5.7 (Additive operators), point 6, about the difference of two pointers:

"When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. The type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header (18.2). As with any other arithmetic overflow, if the result does not fit in the space provided, the behavior is undefined….”

As you can see, the difference of 2 pointers is of type std::ptrdiff_t which is a signed integer of n bits. Therefore, the maximum value it can store is 2^(n-1) - 1. In our example, 2^31 is too big for std::ptrdiff_t, and the standard says that this is undefined behaviour. The fact that this number is casted to std::size_t after that operation is irrelevant.

François

> On 15 Feb 2015, at 21:52, Csaba Raduly <rcsaba at gmail.com> wrote:
> 
> Hi Francois,
> The return type of size() is size_t, which is unsigned, so it can
> represent values up to 2^n-1.
> 
> http://en.cppreference.com/w/cpp/container/vector/max_size says:
> 
> "This value is typically equal to
> std::numeric_limits<size_type>::max(), and reflects the theoretical
> limit on the size of the container. "
> 
> Csaba

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150215/0b5e3dc5/attachment.html>