[libcxx-commits] [libcxx] [libc++] Add input validation for set_intersection() in debug mode. (PR #101508)
Iuri Chaer via libcxx-commits
libcxx-commits at lists.llvm.org
Tue Aug 6 05:23:46 PDT 2024
================
@@ -43,33 +43,31 @@
#include "test_iterators.h"
-namespace {
-
-// __debug_less will perform an additional comparison in an assertion
-static constexpr unsigned std_less_comparison_count_multiplier() noexcept {
-#if _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
- return 2;
+// Debug mode provides no complexity guarantees, testing them would be a waste of effort.
----------------
ichaer wrote:
I agree with what you're saying, I unfortunately have a ton of first-hand experience with not being able to use the best tool for the job because it's too slow, but this isn't as simple as it may seem. I'll argue the case for what I did a little bit out of pride, but mostly because I believe I made this choice for valid reasons which I probably should have shared in the commit description or a code comment.
You'll see we previously had a function called `std_less_comparison_count_multiplier()`, which was already doing some of what we're continuing here: adding a constant factor to an operation count when in debug mode. When I added it I was very uncomfortable with how much it looked like a [change detector](https://testing.googleblog.com/2015/01/testing-on-toilet-change-detector-tests.html), but it was just the one and it looked harmless enough, so I went along with it.

I could make it a less effective change detector by adding some padding to the constant factor and using it for all operations, but the improvement we made to `set_intersection()` has a twist: the worst-case complexity is linear, while in most cases the effective complexity is logarithmic, and we do check some of those cases in the complexity test (because it's an improvement and we don't want to break it, right?). Input validation, however, is linear. So, conceptually, using a larger constant doesn't really work. In practice it would, because in practice we only test with small, fixed-size inputs, but it's conceptually wrong.

Then there is the question of the arbitrarily padded constant: if it's arbitrarily padded, how much is it really helping? It's still a change detector, a test which will at some point break when someone adds more debug validations. With all that, is the test still a net positive? And suppose we changed the complexity validation in debug mode to check that it's linear, with a less-padded constant: would that even make sense?
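For anyone following along, here is a minimal, self-contained sketch of the pattern under discussion. This is not the actual libc++ test code; `counting_less` and `debug_op_multiplier` are hypothetical stand-ins, and the padding value is illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical comparator that counts how often it is invoked.
struct counting_less {
  std::size_t* count;
  bool operator()(int a, int b) const {
    ++*count;
    return a < b;
  }
};

// Hypothetical padded multiplier: debug validations may add extra
// comparisons, so the test multiplies the theoretical bound by some slack.
// The padding is what makes this a "change detector": it breaks whenever
// someone adds another debug check.
constexpr std::size_t debug_op_multiplier() {
#if defined(_LIBCPP_HARDENING_MODE) && _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
  return 4; // arbitrary padding; how much is enough?
#else
  return 1;
#endif
}

int main() {
  std::vector<int> a = {1, 3, 5, 7}, b = {2, 3, 6, 7};
  std::vector<int> out;
  std::size_t comparisons = 0;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out), counting_less{&comparisons});
  // The standard's bound for set_intersection: at most
  // 2*(N1 + N2) - 1 applications of the comparison.
  std::size_t bound = 2 * (a.size() + b.size()) - 1;
  assert(comparisons <= bound * debug_op_multiplier());
}
```

The tension described above is visible here: the assertion encodes a linear bound, but the improved algorithm is expected to do far fewer comparisons in the common case, so a single padded constant can't faithfully bound both the algorithm and the debug-mode validation.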
I agree that performance in debug mode is not something we want to ignore, but I don't have a good solution for any of this, and, since I don't have a good way to automate the validation, I would personally prefer to rely on the safety net of code reviews to catch this sort of thing. Validating the exact number of operations is a really strict check, and a tricky thing to mix with debug instrumentation.
Having said all that, I'm happy to change this proposal if you disagree. We could have a single constant multiplier used for all operations, or one constant for comparisons+projections and another for iterator operations, as sketched below.
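To make the second option concrete, a rough sketch; all names and padding values here are hypothetical, not measured:

```cpp
// Separate slack factor per operation class, instead of one
// per-algorithm multiplier.
constexpr unsigned debug_comparison_multiplier() {
#if defined(_LIBCPP_HARDENING_MODE) && _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
  return 2; // e.g. __debug_less mirrors each comparison with an extra check
#else
  return 1;
#endif
}

constexpr unsigned debug_iterator_op_multiplier() {
#if defined(_LIBCPP_HARDENING_MODE) && _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
  return 2; // e.g. input validation traverses each range once more
#else
  return 1;
#endif
}
```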
What do you think?
https://github.com/llvm/llvm-project/pull/101508