[LLVMbugs] [Bug 19900] New: string::operator==(const char *) unnecessarily computes rhs length, then doesn't take advantage of it

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Fri May 30 13:25:49 PDT 2014


http://llvm.org/bugs/show_bug.cgi?id=19900

            Bug ID: 19900
           Summary: string::operator==(const char *) unnecessarily
                    computes rhs length, then doesn't take advantage of it
           Product: libc++
           Version: 3.4
          Hardware: Macintosh
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: All Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: corydoras at ridiculousfish.com
                CC: llvmbugs at cs.uiuc.edu, mclow.lists at gmail.com
    Classification: Unclassified

I noticed that my code was spending an inordinate amount of time in
std::string::operator!=(const char *). I dug and found that the implementation
is quite suboptimal, always traversing the rhs at least once. This bug tracks
optimizing it.

The std::string::operator== that takes a C string is defined like so:

    template <class _CharT, class _Traits, class _Allocator>
    bool
    operator==(const basic_string<_CharT,_Traits,_Allocator>& __lhs,
               const _CharT* __rhs) _NOEXCEPT
    {
        return __lhs.compare(__rhs) == 0;
    }

where the eventual compare function is:

    template <class _CharT, class _Traits, class _Allocator>
    int
    basic_string<_CharT, _Traits, _Allocator>::compare(const value_type* __s) const _NOEXCEPT
    {
        _LIBCPP_ASSERT(__s != nullptr, "string::compare(): received nullptr");
        return compare(0, npos, __s, traits_type::length(__s));
    }


There are two ways in which this is suboptimal:

1. The call to traits_type::length is a strlen, and may be expensive if the rhs
is long. It is also unnecessary; i.e. it should do the moral equivalent of
strcmp instead of strlen + memcmp.

2. Once we've incurred the expense of computing the length, if the lengths are
different, the strings must not be equal. But it does not use that information
until after the full character-by-character comparison!
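
To make point 1 concrete, here is a rough sketch of a single-pass equality
check that never measures the rhs up front. The name equals_cstr is
hypothetical and this is not libc++ code; it assumes plain std::string and a
NUL-terminated rhs. It stops at the first mismatch or at the rhs terminator,
whichever comes first, and it also copes with an lhs that contains embedded
NUL characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of libc++): compares a std::string with a
// NUL-terminated C string in one pass, without calling strlen on rhs first.
inline bool equals_cstr(const std::string& lhs, const char* rhs)
{
    assert(rhs != nullptr);
    const std::size_t n = lhs.size();
    for (std::size_t i = 0; i < n; ++i) {
        // rhs ended before lhs did, or the characters differ: not equal.
        // Checking the terminator explicitly keeps us from reading past the
        // end of rhs when lhs holds an embedded NUL at the same position.
        if (rhs[i] == '\0' || lhs[i] != rhs[i])
            return false;
    }
    // All n characters matched and none of rhs[0..n-1] was the terminator,
    // so rhs[n] is readable; equal only if rhs ends exactly here.
    return rhs[n] == '\0';
}
```

With this shape, the work is bounded by the length of the common prefix plus
one character, rather than by the full length of the rhs.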

I poked in the standard for operator== and found "Notes: Uses
traits::length()," so I worry that the standard requires this absurd
implementation. But here's hoping we can make it better.
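
If the traits::length() call does have to stay, point 2 can still be
addressed by comparing the lengths before comparing any characters. The
following is only a sketch under that assumption; equals_len_first is a
hypothetical name, not the libc++ implementation, and it is written against
plain std::string for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not libc++ code): still calls traits_type::length on
// rhs, but rejects on a length mismatch before touching the characters.
inline bool equals_len_first(const std::string& lhs, const char* rhs)
{
    assert(rhs != nullptr);
    typedef std::string::traits_type traits;
    const std::size_t rhs_len = traits::length(rhs);  // strlen for char
    if (rhs_len != lhs.size())
        return false;  // different lengths can never compare equal
    // Lengths match, so a single memcmp-style comparison settles it.
    return traits::compare(lhs.data(), rhs, rhs_len) == 0;
}
```

This keeps the one unavoidable traversal of the rhs but skips the
character-by-character comparison whenever the sizes already disagree.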
