<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Missed optimization when summing every second number in unsigned"
   href="https://bugs.llvm.org/show_bug.cgi?id=52593">52593</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Missed optimization when summing every second number in unsigned
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>new-bugs
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Keywords</th>
          <td>missing-feature
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>new bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>llvm-bugs@admitriev.name
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Suppose we have a function

template<class T>
double foo(const std::vector<double>& v) {
    double sum = 0;
    T size = v.size();
    for (T i = 0; i < size; i += 2) {
        sum += v[i];
    }
    return sum;
}


and call it as foo<uint64_t> and foo<int64_t>.
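
A minimal sketch of the two instantiations, for readers who want to reproduce
this locally. The helper same_result_for_both_types() and its input values are
hypothetical and not part of the linked benchmarks; it only checks that both
versions compute the same sum, so any speed difference comes from code
generation alone.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The function from this report, repeated so the sketch is self-contained.
template <class T>
double foo(const std::vector<double>& v) {
    double sum = 0;
    T size = v.size();
    for (T i = 0; i < size; i += 2) {
        sum += v[i];
    }
    return sum;
}

// Hypothetical helper (not in the report): both instantiations must agree.
// The elements at even indices are 1, 2 and 3, so the expected sum is 6.
bool same_result_for_both_types() {
    const std::vector<double> v = {1.0, 10.0, 2.0, 20.0, 3.0};
    const double u = foo<std::uint64_t>(v);
    const double s = foo<std::int64_t>(v);
    assert(u == s);
    return u == 6.0 && s == 6.0;
}
```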

With clang, the signed version is 1.7 times faster (with both libstdc++ and libc++):
<a href="https://quick-bench.com/q/EQxNHHtqi9497mG3ovtfG7fQa6s">https://quick-bench.com/q/EQxNHHtqi9497mG3ovtfG7fQa6s</a>

(Note that gcc generates code for both the signed and the unsigned version that
is approximately as fast as the signed clang version:
<a href="https://quick-bench.com/q/lDXyfset7rLDmOG2mGxjVb1h0Zw">https://quick-bench.com/q/lDXyfset7rLDmOG2mGxjVb1h0Zw</a>)

Indeed, we can see on godbolt that the code generated for the two types is
different: <a href="https://godbolt.org/z/Kjv6q7dsr">https://godbolt.org/z/Kjv6q7dsr</a>
But there seems to be no reason why it should differ.

One can argue that one possible difference is that we can assume no overflow
for the signed version. But actually we can assume no overflow in the unsigned
version as well, for at least three reasons:
1) if there is overflow, then the loop is infinite without any side effects,
which is UB
2) the size is guaranteed to be at most v.max_size(), which is way less than
2**64
3) size is calculated as the difference of two pointers to an 8-byte type, so
it is bounded by 2**61 or so
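
The wraparound argument can be illustrated with a narrow counter. This is only
a sketch: it uses an 8-bit type as a stand-in for uint64_t, and both helper
names are hypothetical. With a step of 2 starting from 0, the counter can only
wrap when size equals the maximum value of the type, which for uint64_t would
be 2**64 - 1, far above v.max_size().

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical helper: reports whether the counter in
//   for (T i = 0; i < size; i += 2)
// would wrap around before the loop condition becomes false.
template <class T>
bool stride2_counter_wraps(T size) {
    static_assert(std::is_unsigned<T>::value && sizeof(T) < sizeof(std::uint64_t),
                  "use a narrow unsigned type so 64-bit arithmetic cannot wrap");
    if (size == 0) return false;  // the loop body never runs
    const std::uint64_t max = std::numeric_limits<T>::max();
    // Largest even index that is still below size.
    const std::uint64_t last = (std::uint64_t{size} - 1) & ~std::uint64_t{1};
    // The next increment wraps only if it would exceed the type's maximum.
    return last + 2 > max;
}

// Hypothetical helper: reason 2 as a runtime check. Only a size equal to
// 2**64 - 1 could wrap, and max_size() is an upper bound on size().
bool vector_counter_cannot_wrap(const std::vector<double>& v) {
    assert(v.size() <= v.max_size());
    return v.max_size() < std::numeric_limits<std::uint64_t>::max();
}
```

For an 8-bit counter, stride2_counter_wraps reports a wrap only for size 255;
every smaller size terminates normally.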
Note that I tried a few ways to use __builtin_assume() to "prove" to the compiler
that there is no overflow, but none of them helped.</pre>
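<pre>For reference, a hypothetical sketch of what such a hint might look like. This
is not the reporter's actual attempt: the ASSUME macro, the function name and
the 2**61 bound are assumptions made for illustration. __builtin_assume() is a
clang builtin, so the macro falls back to a no-op on compilers without it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Use clang's __builtin_assume where available, otherwise do nothing.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assume)
#    define ASSUME(cond) __builtin_assume(cond)
#  endif
#endif
#ifndef ASSUME
#  define ASSUME(cond) ((void)0)
#endif

// Unsigned version of the loop with an explicit upper bound on size.
double sum_even_unsigned_with_hint(const std::vector<double>& v) {
    double sum = 0;
    const std::uint64_t size = v.size();
    // Illustrative bound taken from reason 3: size is at most about 2**61.
    const std::uint64_t bound = std::uint64_t{1} << 61;
    assert(size <= bound);  // verify the hint in debug builds
    ASSUME(size <= bound);  // tell the optimizer that i += 2 cannot wrap
    for (std::uint64_t i = 0; i < size; i += 2) {
        sum += v[i];
    }
    return sum;
}
```

As stated above, hints of this kind did not help in the reporter's experiments,
so this shows only the shape of the attempt, not a workaround.</pre>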
        </div>
      </p>


    </body>
</html>