[llvm] r247269 - [ADT] Rewrite the StringRef::find implementation to be simpler, clearer,
David Blaikie via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 10 08:17:03 PDT 2015
On Thu, Sep 10, 2015 at 4:17 AM, Chandler Carruth via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Author: chandlerc
> Date: Thu Sep 10 06:17:49 2015
> New Revision: 247269
>
> URL: http://llvm.org/viewvc/llvm-project?rev=247269&view=rev
> Log:
> [ADT] Rewrite the StringRef::find implementation to be simpler, clearer,
> and tremendously less reliant on the optimizer to fix things.
>
> The code is always necessarily looking for the entire length of the
> string when doing the equality tests in this find implementation, but it
> previously was needlessly re-checking the size each time among other
> annoyances.
>
> By writing this so simply and directly in terms of memcmp, it also is
> about 8x faster in a debug build, which in turn makes FileCheck about 2x
> faster in 'ninja check-llvm'.
Should we deliberately build FileCheck optimized by default, even in debug
builds? I think we do something like that for llvm-tblgen; maybe we could
broaden that option/flag/support?
> This saves about 8% of the time for
> FileCheck-heavy parts of the test suite like the x86 backend tests.
>
> Modified:
> llvm/trunk/lib/Support/StringRef.cpp
>
> Modified: llvm/trunk/lib/Support/StringRef.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/StringRef.cpp?rev=247269&r1=247268&r2=247269&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Support/StringRef.cpp (original)
> +++ llvm/trunk/lib/Support/StringRef.cpp Thu Sep 10 06:17:49 2015
> @@ -140,37 +140,44 @@ std::string StringRef::upper() const {
> /// \return - The index of the first occurrence of \arg Str, or npos if not
> /// found.
> size_t StringRef::find(StringRef Str, size_t From) const {
> + if (From > Length)
> + return npos;
> +
> + const char *Needle = Str.data();
> size_t N = Str.size();
> - if (N > Length)
> + if (N == 0)
> + return From;
> +
> + size_t Size = Length - From;
> + if (Size < N)
> return npos;
>
> + const char *Start = Data + From;
> + const char *Stop = Start + (Size - N + 1);
> +
> // For short haystacks or unsupported needles fall back to the naive algorithm
> - if (Length < 16 || N > 255 || N == 0) {
> - for (size_t e = Length - N + 1, i = std::min(From, e); i != e; ++i)
> - if (substr(i, N).equals(Str))
> - return i;
> + if (Size < 16 || N > 255) {
> + do {
> + if (std::memcmp(Start, Needle, N) == 0)
> + return Start - Data;
> + ++Start;
> + } while (Start < Stop);
> return npos;
> }
>
> - if (From >= Length)
> - return npos;
> -
> // Build the bad char heuristic table, with uint8_t to reduce cache thrashing.
> uint8_t BadCharSkip[256];
> std::memset(BadCharSkip, N, 256);
> for (unsigned i = 0; i != N-1; ++i)
> BadCharSkip[(uint8_t)Str[i]] = N-1-i;
>
> - unsigned Len = Length-From, Pos = From;
> - while (Len >= N) {
> - if (substr(Pos, N).equals(Str)) // See if this is the correct substring.
> - return Pos;
> + do {
> + if (std::memcmp(Start, Needle, N) == 0)
> + return Start - Data;
>
> // Otherwise skip the appropriate number of bytes.
> - uint8_t Skip = BadCharSkip[(uint8_t)(*this)[Pos+N-1]];
> - Len -= Skip;
> - Pos += Skip;
> - }
> + Start += BadCharSkip[(uint8_t)Start[N-1]];
> + } while (Start < Stop);
>
> return npos;
> }
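
For anyone who wants to try the new control flow outside of LLVM, here is a
rough standalone sketch of the patched function as I read the diff. It is not
the code in StringRef.cpp: it assumes std::string_view in place of StringRef,
and the name findSketch is made up for illustration. The logic (early exits,
memcmp in a do/while loop, and the bad-character skip table for longer
haystacks) is meant to follow the "+" lines above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical helper, not part of LLVM: mirrors the patched
// StringRef::find using std::string_view instead of StringRef.
inline std::size_t findSketch(std::string_view Haystack,
                              std::string_view Needle,
                              std::size_t From = 0) {
  constexpr std::size_t npos = std::string_view::npos;
  if (From > Haystack.size())
    return npos;

  const std::size_t N = Needle.size();
  if (N == 0)
    return From; // An empty needle matches at the starting offset.

  const std::size_t Size = Haystack.size() - From;
  if (Size < N)
    return npos;

  const char *Data = Haystack.data();
  const char *Start = Data + From;
  // One past the last position where a full-length match can begin.
  const char *Stop = Start + (Size - N + 1);

  // Short haystacks or needles too long for a uint8_t skip: naive scan.
  if (Size < 16 || N > 255) {
    do {
      if (std::memcmp(Start, Needle.data(), N) == 0)
        return static_cast<std::size_t>(Start - Data);
      ++Start;
    } while (Start < Stop);
    return npos;
  }

  // Bad-character table: how far to shift based on the byte under the
  // last position of the current window. Default shift is N.
  std::uint8_t BadCharSkip[256];
  std::memset(BadCharSkip, static_cast<int>(N), 256);
  for (std::size_t I = 0; I != N - 1; ++I)
    BadCharSkip[static_cast<std::uint8_t>(Needle[I])] =
        static_cast<std::uint8_t>(N - 1 - I);

  do {
    if (std::memcmp(Start, Needle.data(), N) == 0)
      return static_cast<std::size_t>(Start - Data);
    // Otherwise skip the appropriate number of bytes.
    Start += BadCharSkip[static_cast<std::uint8_t>(Start[N - 1])];
  } while (Start < Stop);

  return npos;
}
```

Because the length checks happen once up front, every memcmp call compares
exactly N bytes, which is the "needlessly re-checking the size each time" cost
that the commit message says the old substr(...).equals(...) loop paid.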
>
>