[llvm-bugs] [Bug 47669] New: It is slower to use std::string::operator+= with a char literal argument than a string literal

Mon Sep 28 08:10:49 PDT 2020

https://bugs.llvm.org/show_bug.cgi?id=47669

            Bug ID: 47669
           Summary: It is slower to use std::string::operator+= with a
                    char literal argument than a string literal
           Product: libc++
           Version: 7.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: All Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: pierre.tallotte at viacesi.fr
                CC: llvm-bugs at lists.llvm.org, mclow.lists at gmail.com

There is a clang-tidy check, performance-faster-string-find, that detects calls
to std::basic_string::find (and related methods) whose argument is a
single-character string literal. According to the check, passing a character
literal instead is more efficient.
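
For readers unfamiliar with the check, here is a minimal sketch of the two call
forms it distinguishes. The helper names are hypothetical and exist only for
illustration. Both forms return the same position; they differ only in which
std::string::find overload is selected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the form that performance-faster-string-find flags.
std::size_t find_with_string_literal(const std::string& s) {
    return s.find("b");  // selects find(const char*, size_type)
}

// Hypothetical helper: the form the check suggests instead.
std::size_t find_with_char_literal(const std::string& s) {
    return s.find('b');  // selects find(char, size_type)
}

// Both forms must agree on the result; only the overload differs.
void check_same_result(const std::string& s) {
    assert(find_with_string_literal(s) == find_with_char_literal(s));
}
```

Calling check_same_result("aaab") exercises both overloads on the same input,
so any difference between the two forms is one of speed, not of result.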

However, I ran a benchmark and found that this holds only for small strings
(those for which the small string optimization is used).

Here is my code:

#include <benchmark/benchmark.h>
#include <string>

static void BM_string_literal(benchmark::State& state)
{
    std::string s;

    for (int i = 0; i < state.range(0); i++)
        s += 'a';

    s += 'b';

    benchmark::DoNotOptimize(s.data());
    benchmark::ClobberMemory();
    size_t pos;

    for (auto _ : state)
    {
        // "b" is a string literal: expected to be slower
        benchmark::DoNotOptimize(pos = s.find("b"));
        benchmark::ClobberMemory();
    }
}

BENCHMARK(BM_string_literal)->RangeMultiplier(2)->Range(8, 8<<10);

static void BM_char_literal(benchmark::State& state)
{
    std::string s;

    for (int i = 0; i < state.range(0); i++)
        s += 'a';

    s += 'b';

    benchmark::DoNotOptimize(s.data());
    benchmark::ClobberMemory();
    size_t pos;

    for (auto _ : state)
    {
        // 'b' is a char literal: expected to be faster
        benchmark::DoNotOptimize(pos = s.find('b'));
        benchmark::ClobberMemory();
    }
}
BENCHMARK(BM_char_literal)->RangeMultiplier(2)->Range(8, 8<<10);

BENCHMARK_MAIN();

According to clang-tidy, I should prefer the code in BM_char_literal because it
is supposed to be faster. However, the benchmark results are the following:

Benchmark                                          Time      CPU  Time Old  Time New  CPU Old  CPU New
[BM_string_literal vs. BM_char_literal]/8       -0.0760  -0.0760         9         8        9        8
[BM_string_literal vs. BM_char_literal]/16      -0.0757  -0.0767         9         8        9        8
[BM_string_literal vs. BM_char_literal]/32      +0.3812  +0.3809         4         5        4        5
[BM_string_literal vs. BM_char_literal]/64      +0.1609  +0.1602         4         5        4        5
[BM_string_literal vs. BM_char_literal]/128     +0.1946  +0.1944         4         5        4        5
[BM_string_literal vs. BM_char_literal]/256     +0.1616  +0.1623         6         6        6        6
[BM_string_literal vs. BM_char_literal]/512     +0.2225  +0.2211         7         9        7        9
[BM_string_literal vs. BM_char_literal]/1024    +0.1052  +0.1051        11        12       11       12
[BM_string_literal vs. BM_char_literal]/2048    +0.0789  +0.0781        18        20       18       20
[BM_string_literal vs. BM_char_literal]/4096    +0.0349  +0.0348        31        32       31       32
[BM_string_literal vs. BM_char_literal]/8192    +0.0053  +0.0042        56        57       56        57
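
To read the first two numeric columns: as far as I understand compare.py, they
are the relative change of the second benchmark (BM_char_literal) with respect
to the first (BM_string_literal), so a positive value means the char literal
version is slower. A minimal sketch of that computation, assuming the formula
is (new - old) / |old| (the function name is hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Relative change of new_time with respect to old_time.
// Assumption: this mirrors what compare.py reports; a positive result
// means the second benchmark took longer than the first.
double relative_change(double old_time, double new_time) {
    assert(old_time != 0.0);  // the ratio is undefined for a zero baseline
    return (new_time - old_time) / std::fabs(old_time);
}
```

The timing columns in the output above are rounded to whole numbers, so
applying this formula to the displayed values will not reproduce the displayed
ratios exactly.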

We can see that the string literal version is faster once the std::string is at
least 32 characters long. I can reproduce these results consistently, so this is
not a variance issue.

Is clang-tidy wrong or is there a bug in libc++? Or is my benchmark wrong
somewhere?

To reproduce my case, here are the commands I used (on Debian stable):

apt-get -y install clang libc++-dev libc++abi-dev git cmake python python-pip
git clone https://github.com/google/benchmark.git
git clone https://github.com/google/googletest.git benchmark/googletest
pushd benchmark
cmake -E make_directory "build"
cmake -E chdir "build" cmake -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_FLAGS="-stdlib=libc++" -DBENCHMARK_DOWNLOAD_DEPENDENCIES=ON ../
cmake --build "build" --config Release --target install
popd
pip install scipy
clang++ -stdlib=libc++ -O3 bench.cpp -lbenchmark -lpthread -o bench
./benchmark/tools/compare.py filters ./bench BM_string_literal BM_char_literal
