[libcxx-commits] [PATCH] D59999: Allow the compiler to optimize `string == "literal string"`.

Tue Jul 30 08:35:06 PDT 2019

EricWF added a comment.

In D59999#1604647 <https://reviews.llvm.org/D59999#1604647>, @mclow.lists wrote:

> @EricWF asked me:
>
> > Can you restate your concern over bypassing char_traits?

Thanks marshall.

> 
> 
> 1. The //real// problem here is that the string code is externally instantiated in the dylib, and thus not available to be inlined. This patch in no way addresses that, merely works around it in this one particular case for one particular operation (equality comparison) on one particular kind of data (narrow character literals).  I'm not really fond of solutions like that.

That's a potential part of the performance problem, but this change also addresses some other missed optimizations that only occur when LLVM sees the construct `memcmp(...) == 0`. When LLVM sees that it knows it only has to detect a difference, but not how the strings actually differ.
See https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp#L941-L943

> 2. There's been no attempt to show that this is a performance bottleneck; just assertions that "the status quo bloats and slows down callers for very common simple comparisons with literal strings". Improving a micro-benchmark may or may not have an effect on real-world code.

Tests were previously checked in that demonstrate the improvement. Microbenchmarks are going to be the only easy way to demonstrate the performance difference, but we have confirmed the performance win to real world code.

> 3. The standard is quite clear here. The comparison calls `string::compare`, which calls `char_traits::compare`. With this patch, we're no longer doing that. This is probably allowed by the "as if" rule, but still seems "wrong" to me.

Right, this is allowed under the `as-if` rule. It also doesn't use an `effects equivalent to` clause, so the standard only specifies what the return value should be, not how it's actually computed. 
It's a bit unfortunate to duplicate code, but the performance wins seem worth it.

> People who do not care about ABI stability could (should?) remove the external template instantiations, and get the benefits of inlining here (and in other calls).  If someone were to do this, and report back, we would have data to argue with about whether or not this patch would make any difference to actual users.

As I mentioned earlier, this optimization is more than just allowing for compiler inlining. It's trying to get the `memcpy -> bcmp` optimization.

Hopefully this addresses your concerns. I plan to move this patch later today.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59999/new/

https://reviews.llvm.org/D59999