[libcxx-commits] [libcxx] [libc++] Remove unnecessary division and modulo operations in bitset (PR #121312)

Thu Mar 20 06:46:08 PDT 2025

================
@@ -465,10 +465,10 @@ protected:
     return __const_reference(&__first_, __storage_type(1) << __pos);
   }
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 __iterator __make_iter(size_t __pos) _NOEXCEPT {
-    return __iterator(&__first_ + __pos / __bits_per_word, __pos % __bits_per_word);
+    return __iterator(&__first_, __pos);
----------------
winner245 wrote:

I found that I was slightly wrong in this. In a single word implementation (`__bitset<1, _Size>`), normally (when we read or write individual bits) we have `__pos < __bits_per_word` (strictly less). But for operations that involve calls to some standard algorithms with a start and past-the-end iterators (e.g., `bitset::operator==` calls `std::equal`, `bitset::operator<<=, operator>>=` call `std::{copy, copy_backward, fill_n}`,  and `bitset::count` calls `std::count`), we must allow the `__pos == __bits_per_word` case to give a past-the-end iterator for these algorithms, i.e., `__bitset<1, _Size>::__make_iter(/*__pos = */_Size)`. 

So my previous version without the `_LIBCPP_ASSERT_INTERNAL` check should have caused some problem. However, we didn't seen any test failure and all CI passed. This is then detected by the `_LIBCPP_ASSERT_INTERNAL(__pos < __bits_per_word`, "...")` check in the last push. So I carefully analyzed why we didn't cause any failure to the algorithms when I first made the change in this patch. The reason is below. 

When these algorithms accepts two `__bit_iterator` arguments `__first` and `__last`, the algorithms with `__bit_iterator` optimizations do not use the usual loop in standard algorithms such as: 

```cpp
for (; __first != __last; ++__first)
   operate on *__first;
```

Instead, all the `__bit_iterator` optimized algorithms would first calculate the distance between the iterator pairs and then count down the counter value: 

```cpp
difference_type __n = __last - __first;
while (__n) {
  do operations on __first;
  ++__first;
  --n;
}
``` 

With the correct past-the-end iterator `__bit_iterator(/* __seg_ = */__first + 1, /*__ctz_ = */ 0)` and the incorrect one `__bit_iterator(__first, __bits_per_word)`, the distance `__n` is always equal: 

```cpp
__n = __bit_iterator(__first + 1, 0) - __bit_iterator(__first, 0) = (__first + 1 - __first) * __bits_per_word = __bits_per_word;
__n = __bit_iterator(__first, __bits_per_word ) - __bit_iterator(__first, 0) = __bits_per_word - 0 = __bits_per_word;
```

This explains why my initial change did not cause any problem. 

After realizing this, I think the precondition should be `__pos <= __bits_per_word` (allowing `==` case to accommodate the past-the-end iterator). I still think for the single-word, we can avoid the more expensive `/` and '%' operations by using a ternary operator which handles the `==` case as a special case. Also, I think it would be great that our `__bit_iterator` should require that `__ctz_ < __bits_per_word`, then this would detect my mistake in the first place. Additionally, operations within `__bit_iterator` actually does require `__ctz_ < __bits_per_word`. For example, `__bit_iterator::operator*()` would have undefined behavior when `__ctz == __bits_per_word` (because the operation `__storage_type(1) << __ctz_` is UB when shifting amount is not strictly less than the word with [1]). So I've add the precondition `__pos < __bits_per_word` (strictly less) for `__bit_iterator` and `__pos <= __bits_per_word` (with ==) for bitset.




[1] https://en.cppreference.com/w/cpp/language/operator_arithmetic#Bitwise_shift_operators
> If the value of rhs is negative or is not less than the number of bits in lhs, the behavior is undefined.




https://github.com/llvm/llvm-project/pull/121312