[cfe-dev] libc++: max_size() of a std::vector

François Fayard fayard.francois at icloud.com
Tue Feb 17 15:22:53 PST 2015


Hi Mikael,

Thanks for the information. I did not realise that, as I am quite new to C++. I come from a Fortran background, which is why indices mean so much to me :-)

For the bug report, it turns out that I’ve done more tests. For std::vector<char>
- libc++ version 3.5 has been fixed and max_size() returns PTRDIFF_MAX
- libstdc++ version 4.9.2 is buggy and returns SIZE_MAX
- Visual Studio 2013 is buggy and returns SIZE_MAX
So libc++ is safe, and I'll file bug reports against the other standard libraries.

From the very beginning, my feeling has been that the choice of std::size_t for indices is a *very* bad decision, for several reasons: comparing signed and unsigned integers is like shooting yourself in the foot, and compilers can't optimize as aggressively with unsigned integers because they must assume modulo 2^n arithmetic. The only explanation I had for this choice was that in the old days of 16-bit computers, the extra bit made a real difference between std::size_t and std::ptrdiff_t. Now even that historical reason is falling apart.

Best regards,
François

> On 17 Feb 2015, at 22:55, Mikael Persson <mikael.s.persson at gmail.com> wrote:
> 
> Hi,
> 
> I think this problem is even worse than you suspect François. Even if the static-cast to size_t has the "expected result" (that the unsigned value turns out correct even if it went through an overflowing operation on the ptrdiff_t), which I agree will probably happen on nearly all platforms... this is not the worst problem.
> 
> The worst problem is that the max-size function is used internally by the vector to determine if an increase in capacity is possible. This means that the vector would accept a resize operation that makes the size exceed PTRDIFF_MAX, and at that point, if you use the iterators in any algorithm or code that takes a difference between them (e.g., std::sort), you will get a negative difference_type value (overflowed but still interpreted as a signed value) between those iterators, which will lead to complete disaster on all but the most pedantic / defensive code out there (how many algorithms do you think check whether the (last - first) difference is negative or overflows? probably very few).
> 
> The real issue is that if max-size is supposed to regulate how large a vector can be in practice (and that's exactly what the standard requires this value to represent), then resizing a vector to that size should produce a "perfectly good" vector in the sense that all operations with it (incl. algorithms on random-access iterators obtained from that vector) should be well-behaved. And if max-size is allowed to exceed PTRDIFF_MAX, this is simply not the case, because nearly everything you could do with a vector larger than PTRDIFF_MAX is undefined behaviour. 
> 
> Cheers,
> Mikael.
