<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/131692>131692</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[libc++] {std, ranges}::copy fails to copy vector<bool> correctly with small storage types
</td>
</tr>
<tr>
<th>Labels</th>
<td>
libc++
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
winner245
</td>
</tr>
</table>
<pre>
The current implementation of `{std, ranges}::copy` does not correctly handle `vector<bool>` when the underlying storage type (`__storage_type`) is smaller than `int`, e.g., `unsigned char`, `unsigned short`, `uint8_t` and `uint16_t`.
##### Reproducer
A minimal reproducer is as follows ([Godbolt Link](https://godbolt.org/z/aoWThxbxr)):
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <limits>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Allocator with configurable size_type and difference_type, used here so that
// vector<bool> ends up with an 8-bit storage word.
template <typename T, typename Size, typename Difference>
class sized_allocator {
  template <typename U, typename Sz, typename Diff>
  friend class sized_allocator;

public:
  using value_type                  = T;
  using size_type                   = Size;
  using difference_type             = Difference;
  using propagate_on_container_swap = std::true_type;

  constexpr explicit sized_allocator(int d = 0) : data_(d) {}

  template <typename U, typename Sz, typename Diff>
  constexpr sized_allocator(const sized_allocator<U, Sz, Diff>& a) noexcept : data_(a.data_) {}

  constexpr T* allocate(size_type n) {
    if (n > max_size())
      throw std::bad_array_new_length();
    return std::allocator<T>().allocate(n);
  }

  constexpr void deallocate(T* p, size_type n) noexcept { std::allocator<T>().deallocate(p, n); }

  constexpr size_type max_size() const noexcept {
    return std::numeric_limits<size_type>::max() / sizeof(value_type);
  }

private:
  int data_;

  constexpr friend bool operator==(const sized_allocator& a, const sized_allocator& b) {
    return a.data_ == b.data_;
  }
  constexpr friend bool operator!=(const sized_allocator& a, const sized_allocator& b) {
    return a.data_ != b.data_;
  }
};

int main() {
  using Alloc = sized_allocator<bool, std::uint8_t, std::int8_t>;
  std::vector<bool, Alloc> in(8, false, Alloc(1));
  std::vector<bool, Alloc> out(8, true, Alloc(1));

  std::copy(in.begin(), in.begin() + 1, out.begin()); // out[0] = false

  // Only a single bit was assigned, so the expected output is 01111111.
  // With the bug, the entire byte is zeroed instead.
  for (std::size_t i = 0; i < out.size(); ++i)
    std::cout << out[i];
  std::cout << '\n';
}
```
In this example, `std::copy` copies a single `false` bit from `in` into `out[0]`. However, due to incorrect behavior in `std::copy`, every bit in the destination storage word is zeroed, not just the first one.
##### Root Cause Analysis
The root cause in this specific example lies in the following problematic bit mask computation in `std::copy`:
https://github.com/llvm/llvm-project/blob/584f8cc30554c89fdd27cc9e527416a6e4e2cc45/libcxx/include/__algorithm/copy.h#L56
In `~__storage_type(0)`, the operand undergoes integral promotion to `int` before `~` is applied, so the result is the `int` value `-1`, which is subsequently left-shifted and right-shifted. However, left-shifting a negative value is UB before C++20 [2], and right-shifting a negative value performs sign extension since C++20 [1]. Because of the sign extension, all bits in the mask are set to `1`, making the mask incorrect and causing the copy to overwrite bits it should leave untouched.
It appears that Clang exhibits consistent incorrect behavior in `std::copy` even before C++20. I believe this is because Clang already performed sign-bit extension in right-shift operations even before C++20.
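The promotion problem can be reproduced in isolation, without `vector<bool>`. The following is a minimal sketch, not libc++ code: `storage_type` and `naive_low_mask` are hypothetical names chosen for illustration, and only the right-shift half of the mask is shown so that the example does not rely on pre-C++20 UB. Before C++20 the right shift of a negative value is implementation-defined; the comment below assumes the arithmetic shift that GCC and Clang perform.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for an 8-bit vector<bool> storage word.
using storage_type = std::uint8_t;

// The operand of `~` is promoted to int, so the result is int(-1), not 0xFF.
static_assert(std::is_same<decltype(~storage_type(0)), int>::value,
              "~storage_type(0) has type int");
static_assert(~storage_type(0) == -1, "every bit of the int is set");

// Intended: a mask keeping only the lowest n bits of the word (1 <= n <= 8).
// Actual: the shift is applied to int(-1), sign extension keeps every bit set,
// and the conversion back to storage_type yields 0xFF for every n.
inline storage_type naive_low_mask(unsigned n) {
  return static_cast<storage_type>(~storage_type(0) >> (8 - n));
}
```

For a one-bit copy the intended mask is `0x01`, but `naive_low_mask(1)` yields `0xFF`, which matches the observation that the whole byte is overwritten.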
There are several other bitwise operations in `std::copy` that are similarly flawed and need to be fixed.
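One possible way to make such mask computations promotion-safe is to start from an unsigned all-ones value and convert back to the storage type after every shift. The sketch below illustrates the idea under stated assumptions: `all_ones` and `bit_range_mask` are hypothetical helpers, not libc++ internals, and this is not necessarily how the issue will be fixed upstream.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: all-ones value of an unsigned storage type, obtained
// without applying `~` to a promoted int.
template <class Storage>
constexpr Storage all_ones() {
  static_assert(std::is_unsigned<Storage>::value, "storage words are assumed unsigned");
  return std::numeric_limits<Storage>::max();
}

// Hypothetical helper: mask covering bits [first, first + count) of one word.
// Preconditions: count >= 1 and first + count <= number of bits in Storage.
// Each shift operates on a non-negative value, and each intermediate result is
// converted back to Storage so that promoted high bits are discarded.
template <class Storage>
constexpr Storage bit_range_mask(unsigned first, unsigned count) {
  return static_cast<Storage>(
      static_cast<Storage>(all_ones<Storage>() << first) &
      static_cast<Storage>(all_ones<Storage>() >>
                           (std::numeric_limits<Storage>::digits - first - count)));
}
```

With these helpers, `bit_range_mask<std::uint8_t>(0, 1)` evaluates to `0x01`, so a one-bit copy into the first position would touch only that bit.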
[1] [[expr.shift]/3](https://eel.is/c++draft/expr.shift#note-2)
> Right-shift on signed integral types is an arithmetic right shift, which performs sign-extension[.](https://eel.is/c++draft/expr.shift#3.sentence-3)
[2] [cppreference](https://en.cppreference.com/w/cpp/language/operator_arithmetic#Bitwise_shift_operators)
> For negative a, the behavior of a << b is undefined. (until C++20)
</pre>