[PATCH] D56593: [SelectionDAG][RFC] Allow the user to specify a memeq function (v5).

Noah Goldstein via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 6 11:26:03 PDT 2021


goldstein.w.n added a comment.
Herald added subscribers: ecnelises, pengfei.

In D56593#1393325 <https://reviews.llvm.org/D56593#1393325>, @jyknight wrote:

> It'd be great to see this somewhat more widely publicized, outside of just the clang community. If libc implementors are aware of the gains and are willing to provide an actually-faster bcmp implementation, it'd be a lot better, than having this optimization that doesn't really optimize anything without users providing their own bcmp implementation.

Tried to add an optimized `bcmp` for GLIBC here <https://marc.info/?t=163157542200002&r=1&w=3>.

It was not well received because `bcmp` is not a standard function. It seems GLIBC supports `bcmp` out of reluctant necessity rather than any desire for it to be fast.

There was agreement that the functionality was useful, so GLIBC landed on `__memcmpeq` to provide it. The patches made it to HEAD <https://sourceware.org/git/?p=glibc.git;a=commit;h=44829b3ddb64e99e37343a0f25b2c082387d31a5>
and will be available starting with the 2.35 release. It is declared in "string.h", and its availability can be detected by checking for GLIBC version >= 2.35.

Currently only x86_64 has an optimized version; the rest of the targets still just redirect to `memcmp`.
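
As a rough sketch of the version check described above, user code could guard its use of `__memcmpeq` and fall back to `memcmp` elsewhere. The `HAVE_MEMCMPEQ` macro and the `bytes_equal` helper are hypothetical names used only for illustration; they are not part of GLIBC or of this patch. The sketch assumes `__GLIBC_PREREQ` is visible after including "string.h", which is the case on GLIBC.

```c
#include <stddef.h>
#include <string.h>

/* HAVE_MEMCMPEQ is a hypothetical macro for this sketch.
   The checks are nested so that non-GLIBC preprocessors never
   have to parse the function-like __GLIBC_PREREQ(...) call. */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 35)
#  define HAVE_MEMCMPEQ 1
# endif
#endif

/* Hypothetical helper: returns nonzero iff the first n bytes match.
   Only equality is needed, so the sign of the result is never used. */
int bytes_equal(const void *a, const void *b, size_t n) {
#ifdef HAVE_MEMCMPEQ
  /* __memcmpeq returns zero when equal, some nonzero value otherwise. */
  return __memcmpeq(a, b, n) == 0;
#else
  /* Older GLIBC or another libc: plain memcmp gives the same answer. */
  return memcmp(a, b, n) == 0;
#endif
}
```

A caller would write `bytes_equal(p, q, len)` wherever it currently writes `memcmp(p, q, len) == 0`; the ordering information that `memcmp` provides is deliberately dropped.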

Working on a patch to add support in LLVM.

> Given the potential for gains reported, I'd hate to see this as a change that people can't actually take advantage of.
>
> A couple things I'd worry about, which I think this change is doing properly, but just to double check:
>
> - I assume that with -ffreestanding, this will be disabled.
> - Some folks avoid -ffreestanding, even though they have a freestanding implementation (sigh). For them, I assume -fno-builtin=bcmp will also disable this.
>
> We should document this change for such folk, as they will need to either add the flag, or provide their own bcmp implementation.
>
> Some other transforms in SimplifyLibCalls transform strcmp and strncmp into memcmp. I'm not sure if these optimizations will iterate or not -- will this properly transform strcmp -> memcmp -> bcmp, where appropriate?
>
> Finally, I note that we don't optimize user code which calls bcmp the way we do user code which calls memcmp. Neither in ExpandMemCmp, or SimplifyLibCalls do we handle bcmp. While that's not something that needs to be simultaneously with this change, probably we should be doing so.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56593/new/

https://reviews.llvm.org/D56593


