[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.

Fri Jan 4 02:27:29 PST 2019

Thanks for the suggestions Hal,

So if I understand correctly, you're recommending we add a module flag
<https://llvm.org/docs/LangRef.html#module-flags-metadata> to LLVM,
something like:

!llvm.module.flags = !{..., !123}
!123 = !{i32 1, !"memeq_lib_function", !"user_memeq"}

I've given it a try in the following patch: https://reviews.llvm.org/D56311
If this sounds reasonable, I can start working on adding a CodeGenOptions
entry to clang to see what that entails.

I don't think the function attribute works here because we want this to be
globally enabled instead of per-function (but maybe I misunderstood what
you were suggesting).
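
For concreteness, here is a minimal sketch in C of what a function named by
that flag could look like, and of the call-site rewrite being discussed. This
is not the code from either patch: the signature and the return convention
(0 when the buffers are equal, nonzero otherwise, mirroring `memcmp`) are
assumptions made for illustration, and `equal_via_memcmp` /
`equal_via_memeq` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-provided function; the name matches the module flag
 * example above. Assumed contract: returns 0 iff the first n bytes of a and
 * b are equal, and some nonzero value otherwise. Unlike memcmp, the nonzero
 * value carries no ordering information. */
int user_memeq(const void *a, const void *b, size_t n) {
  const unsigned char *pa = (const unsigned char *)a;
  const unsigned char *pb = (const unsigned char *)b;
  uint64_t diff = 0;
  size_t i = 0;

  /* Compare 8 bytes at a time. Differences are accumulated with XOR/OR, so
   * no chunk depends on the result of the previous one; this sketch does
   * not exit early and does not locate the first differing byte. */
  for (; i + sizeof(uint64_t) <= n; i += sizeof(uint64_t)) {
    uint64_t wa, wb;
    memcpy(&wa, pa + i, sizeof wa); /* memcpy avoids unaligned loads */
    memcpy(&wb, pb + i, sizeof wb);
    diff |= wa ^ wb;
  }

  /* Remaining tail bytes. */
  for (; i < n; ++i)
    diff |= (uint64_t)(pa[i] ^ pb[i]);

  return diff != 0;
}

/* What the user writes today: only the "== 0" part of the result is used. */
int equal_via_memcmp(const char *a, const char *b, size_t n) {
  return memcmp(a, b, n) == 0;
}

/* What the backend could emit instead when the flag names user_memeq. */
int equal_via_memeq(const char *a, const char *b, size_t n) {
  return user_memeq(a, b, n) == 0;
}
```

The two wrappers are meant to be observably equivalent; the only thing the
rewrite gives up is the sign of the `memcmp` result, which the call site
never looked at.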

On Thu, Jan 3, 2019 at 6:40 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:

>
> On 1/3/19 3:29 AM, Clement Courbet via llvm-dev wrote:
>
> Hi all,
>
> We'd like to suggest adding a *-memeq-lib-function* flag to allow the
> user to specify a `memeq()` function to improve string equality check
> performance.
>
> Hi, Clement,
>
> We really shouldn't be adding backend flags for anything at this point
> (except for debugging and the like). A function attribute should be fine,
> or global metadata if necessary. A function attribute should play better
> with LTO, and so that's generally the recommended design point.
>
>
>
> Right now, when LLVM encounters a *string equality check*, e.g. `if
> (memcmp(a, b, s) == 0)`, it tries to expand it to an equality comparison if
> `s` is a small compile-time constant, and falls back to calling `memcmp()`
> otherwise.
>
> This is sub-optimal because memcmp has to compute much more than equality.
>
> We propose adding a way for the user to specify a `memeq` library function
> (e.g. `-memeq-lib-function=user_memeq`) which will be called instead of
> `memcmp()` when the result of the memcmp call is only used for equality
> comparison.
>
> `memeq` can be made much more efficient than `memcmp` because equality
> comparison is trivially parallel while lexicographic ordering has a chain
> dependency.
>
> We measured a very large improvement from this approach on our internal
> codebase. A significant portion of this improvement comes from the STL,
> typically `std::string::operator==()`.
>
> Note that this is a *backend-only change*. Because the C family of
> languages does not have a standard `memeq()` (POSIX used to have `bcmp()`,
> but it was marked legacy in 2001 and later removed), C/C++ code cannot
> communicate the equality comparison semantics to the compiler.
>
> We did not add an RTLIB entry for memeq because the user environment is
> not guaranteed to contain a `memeq()` function as the libc has no such
> concept.
>
> If there is interest, we could also contribute our optimized `memeq` to
> compiler-rt.
>
>
> That would be useful.
>
> Thanks again,
>
> Hal
>
>
>
> A proof of concept patch for this RFC can be found here:
> https://reviews.llvm.org/D56248
>
> Comments & suggestions welcome!
> Thanks,
>
> Clement
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>