[libcxx-commits] [PATCH] D146421: [release/16.x][libc++] Revert the bitset sort optimization

Wed Mar 29 06:47:39 PDT 2023

ldionne added a comment.

In D146421#4228982 <https://reviews.llvm.org/D146421#4228982>, @danlark wrote:

>> Shuffling the range wouldn't help here -- the problem wasn't non-determinism in the output, it's that we literally go and read out-of-bounds when the comparator is wrong. I think this is happening in __insertion_sort_unguarded, and I think I can add an assertion check just in there to "fix" the issue but I haven't had time to get the reproducer integrated in our test suite yet.
>
> In the previous implementation, out of bounds is possible, see Howard Hinnant on stackoverflow <https://stackoverflow.com/questions/38966516/should-sorting-algorithm-pass-same-element-in-the-comparison-function>. The sort will sigsegv with 31 elements with the comparator that always returns true.

Yeah, I was aware of this and in fact I have a downstream bug report about `std::sort` mis-behaving on floats containing NaNs, but I wasn't aware of that specific SO question, thanks for pointing it out.

> In this patch these lines start with 561 on the right. New implementation triggers unguarded optimaztion with fewer elements. That's the root cause. The rollback likely only hides the problem for customers which sort fewer elements.
>
> FWIW, shuffling can help finding these problems, say, pivot takes NaN and thus it can go into the unguarded case.

This patch is not so much about fixing the underlying issue, which would require more work than we can do in the LLVM 16 time frame, given that it's already been released. This patch is about not rocking the boat too much for release/16.x, since this change in behavior was arguably not planned (BTW if it was conscious, it could have been communicated during the original review). In other words, I am OK with shipping the same slightly broken `std::sort` that we've been shipping for a long time since if users have incorrect comparators, their code seems to be working fine with it. I am not blessing their code, but at least we're not making an active change that's going to break them or introduce serious issues we could be held responsible for.

Now, I am OK with making an active change that will break them, but only if we have a decent story for catching these potentially serious issues like I'm proposing in D147089 <https://reviews.llvm.org/D147089>.

Does that make sense?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146421/new/

https://reviews.llvm.org/D146421