[PATCH] D56593: [SelectionDAG][RFC] Allow the user to specify a memeq function (v5).

Clement Courbet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 12 07:24:26 PST 2019

courbet added a comment.

In D56593#1393325 <https://reviews.llvm.org/D56593#1393325>, @jyknight wrote:

> It'd be great to see this somewhat more widely publicized, outside of just the clang community. If libc implementors are aware of the gains and are willing to provide an actually-faster bcmp implementation, it'd be a lot better, than having this optimization that doesn't really optimize anything without users providing their own bcmp implementation.

> Given the potential for gains reported, I'd hate to see this as a change that people can't actually take advantage of.
> A couple things I'd worry about, which I think this change is doing properly, but just to double check:
> - I assume that with -ffreestanding, this will be disabled.
> - Some folks avoid -ffreestanding, even though they have a freestanding implementation (sigh). For them, I assume -fno-builtin=bcmp will also disable this. We should document this change for such folk, as they will need to either add the flag, or provide their own bcmp implementation.

While compiling the following code:

  return memcmp(a, b, i) == 0;

1 - `-ffreestanding` will emit `memcmp()` instead of `bcmp()`, because freestanding disables `bcmp()` as a library function.
2 - `-fno-builtin-memcmp` will emit `memcmp()` instead of `bcmp()`, because this one instructs LLVM to not treat calls to memcmp() specially.
3 - `-fno-builtin-bcmp` will emit `bcmp()`. AFAICT, this is reasonable because this flag is supposed to tell clang/llvm to not treat `bcmp` specially, not to avoid using it (which is what `freestanding` is). That being said, I can see why the behaviour you're suggesting is good for users, so we probably want to teach clang to treat `bcmp` as a builtin: https://reviews.llvm.org/D58120

> Some other transforms in SimplifyLibCalls transform strcmp and strncmp into memcmp. I'm not sure if these optimizations will iterate or not -- will this properly transform strcmp -> memcmp -> bcmp, where appropriate?

They do compose, I've added a test to show that.

> Finally, I note that we don't optimize user code which calls bcmp the way we do user code which calls memcmp. Neither in ExpandMemCmp, or SimplifyLibCalls do we handle bcmp. While that's not something that needs to be simultaneously with this change, probably we should be doing so.

That's a very good point. I've created PR40699 with this.




More information about the llvm-commits mailing list