[libcxx-commits] [libcxx] [libc++][ranges] optimize the performance of `ranges::starts_with` (PR #84570)

Nikolas Klauser via libcxx-commits libcxx-commits at lists.llvm.org
Fri May 10 01:53:13 PDT 2024


philnik777 wrote:

Sorry for being so slow on this. I had a lot to do recently. TL;DR it's complicated.

The long version: AVX2 seems to make quite a significant difference for `mismatch`. @xiaoyang-sde I guess you compiled without any `-m` flags? The glibc functions are heavily optimized for a lot of platforms, and glibc chooses the implementation at runtime based on the CPU it's running on. I wasn't able to benchmark yet, but it seems that other libc implementations aren't as well optimized, which makes the performance improvement dependent on your compilation flags and on which libc you're using (e.g. musl doesn't seem to have any optimized `memcmp`, which would make our implementation 40x faster). We would be able to do something similar to what glibc does with `[[gnu::target_clones]]` or `[[gnu::target_version]]`, but I'm not sure it's worth the cost.
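
To make the `[[gnu::target_clones]]` option concrete, here is a minimal sketch of what per-CPU dispatch could look like. It is not code from this PR or from libc++: `first_mismatch` and `SKETCH_MULTIVERSION` are hypothetical names, and the guard assumes that multiversioning is only usable on x86-64 Linux with glibc (it relies on ifunc support). The loop itself is plain scalar code; whether a compiler vectorizes the AVX2 clone is not guaranteed, so this only illustrates the dispatch mechanism.

```cpp
#include <cassert>  // also pulls in a libc header, so __GLIBC__ is visible below
#include <cstddef>

// Assumption: only enable multiversioning where ifunc-based dispatch is known
// to be available (x86-64 Linux with glibc, GCC or Clang). Elsewhere the macro
// expands to nothing and a single ordinary function is emitted.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) && \
    (defined(__GNUC__) || defined(__clang__))
#  define SKETCH_MULTIVERSION [[gnu::target_clones("avx2", "default")]]
#else
#  define SKETCH_MULTIVERSION
#endif

// Hypothetical helper: returns the index of the first differing byte, or n if
// the first n bytes are equal. With target_clones the compiler emits one copy
// compiled for AVX2 and one baseline copy, and a resolver picks between them
// once at load time based on the CPU.
SKETCH_MULTIVERSION
std::size_t first_mismatch(const unsigned char* a, const unsigned char* b,
                           std::size_t n) {
  assert(n == 0 || (a != nullptr && b != nullptr));
  std::size_t i = 0;
  while (i < n && a[i] == b[i])
    ++i;
  return i;
}
```

The cost mentioned above shows up here as well: every multiversioned function is compiled more than once and is called through an indirection, and it can no longer be trivially inlined into its callers.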

I think we need a more general discussion on how far we want to go for improved performance. For now I don't want to block this change, but I'd like us to agree on a firmer policy before doing similar optimizations (i.e. when to select our own vector code vs. the libc function).


https://github.com/llvm/llvm-project/pull/84570

