<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - set_union: clang makes code significantly slower by incorrect optimizations."
   href="https://bugs.llvm.org/show_bug.cgi?id=35502">35502</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>set_union: clang makes code significantly slower by incorrect optimizations.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>5.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>-New Bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>denis.yaroshevskij@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hi!

I've been working on a small research project where I needed a version of
set_union that is biased towards the left range.

Here are two implementations: V6 is the one I wanted to write, and V7 is the
one I had to write in order to stop the bad codegen.

<a href="https://godbolt.org/g/axVMWr">https://godbolt.org/g/axVMWr</a>

How it is supposed to work (V6):
While elements come from the second range, I loop between checkSecond and
start. As soon as more elements come from the left range, I start falling
through the unrolled loop.

checkSecond is responsible for the whole right range, so it's crucial not to
slow it down. However, this is not what clang does.

If I move my first unrolled iteration to the top of checkSecond (see V7) clang
stops trying to optimize my unrolled loop at the expense of checkSecond, and
the trick works.
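
To make the control flow described above easier to follow, here is a minimal
sketch of a left-biased union over raw pointers. It is not the V6 or V7 code
from the godbolt link: the function name set_union_biased and the two-step
unroll are assumptions made for illustration, and only the checkSecond label
name is taken from the description.

```cpp
// Hypothetical sketch, not the V6/V7 code from the godbolt link.
// Merges sorted ranges [f1, l1) and [f2, l2) into out, writes elements
// present in both ranges once, and returns the end of the output.
long* set_union_biased(const long* f1, const long* l1,
                       const long* f2, const long* l2, long* out) {
  if (f1 == l1) goto copySecond;
  if (f2 == l2) goto copyFirst;

checkSecond:
  // Tight loop for the right range: take from it while its head is smaller.
  if (*f1 > *f2) {
    *out++ = *f2++;
    if (f2 == l2) goto copyFirst;
    goto checkSecond;
  }

  // First unrolled step: *f1 is not greater than *f2, so the left element
  // is written. On a tie the duplicate in the right range is skipped.
  if (!(*f2 > *f1)) ++f2;
  *out++ = *f1++;
  if (f1 == l1) goto copySecond;
  if (f2 == l2) goto copyFirst;

  // Second unrolled step: keep falling through while the left range wins,
  // otherwise go back to the tight loop for the right range.
  if (*f1 > *f2) goto checkSecond;
  if (!(*f2 > *f1)) ++f2;
  *out++ = *f1++;
  if (f1 == l1) goto copySecond;
  if (f2 == l2) goto copyFirst;
  goto checkSecond;

copyFirst:
  // The right range is exhausted: copy what is left of the left range.
  while (f1 != l1) *out++ = *f1++;
  return out;

copySecond:
  // The left range is exhausted: copy what is left of the right range.
  while (f2 != l2) *out++ = *f2++;
  return out;
}
```

Called as set_union_biased(a, a + n, b, b + m, out), it returns a pointer one
past the last element written; out needs room for n + m elements.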

The first version is 2-2.5 times slower in some cases and never faster.

Benchmark:
2000 uniformly distributed 64-bit integers, split between two vectors.

From left to right: fewer integers in the left vector and more integers in the
right vector (far left: 2000/0, middle: 1000/1000, far right: 0/2000).

Results: <a href="https://plot.ly/~dyaroshev/47/">https://plot.ly/~dyaroshev/47/</a>

Compiler: Apple LLVM version 9.0.0 (clang-900.0.38) (I don't know which
upstream clang version this corresponds to). With version 5 the assembly is
still wrong; see the godbolt link.

Compiler options: clang++ --std=c++14 -O3 -Werror -Wall

Benchmark code:
<a href="https://github.com/DenisYaroshevskiy/srt-library/blob/master/other_benchmarks/set_unions_bench.cc">https://github.com/DenisYaroshevskiy/srt-library/blob/master/other_benchmarks/set_unions_bench.cc</a>

Related bug: <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Faster implementation of std::set_union"
   href="show_bug.cgi?id=35499">https://bugs.llvm.org/show_bug.cgi?id=35499</a>

Best,
Denis.</pre>
        </div>
      </p>


    </body>
</html>