<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Missed optimization: removal of 'inc' and 'add' instructions"
   href="https://bugs.llvm.org/show_bug.cgi?id=44303">44303</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Missed optimization: removal of 'inc' and 'add' instructions
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>MacOS X
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C++
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>jhg023@bucknell.edu
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>uint64_t test( uint64_t a, uint64_t b, uint64_t n, int norm ) { // Line 1
      uint64_t prod = a * b;                                        // Line 2
      uint64_t r = ( norm - ( prod + 1 ) * n ) + n;                 // Line 3
      return ( r < n ) ? r : ( r - n );                             // Line 4
    }

According to Godbolt, Clang trunk compiles the above function to the following
assembly (<a href="https://godbolt.org/z/6AKLMu">https://godbolt.org/z/6AKLMu</a>):

    test(unsigned long, unsigned long, unsigned long, int): # @test(unsigned long, unsigned long, unsigned long, int)
            imul    rdi, rsi
            movsxd  rax, ecx
            inc     rdi
            imul    rdi, rdx
            sub     rax, rdi
            add     rdx, rax
            cmovb   rax, rdx
            ret

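Because uint64_t arithmetic wraps modulo 2^64, ( prod + 1 ) * n expands to
prod * n + n, and the trailing + n cancels the extra n. Line 3 of the function
is therefore equivalent to the following snippet: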

    uint64_t r = ( norm - prod * n );
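
As a spot check of that equivalence, here is a minimal sketch that evaluates
both forms of line 3 side by side. The names u64, test_original, test_simplified
and same_result are hypothetical helpers for illustration, and u64 assumes that
unsigned long long is 64 bits wide (as on x86-64), standing in for uint64_t:

```cpp
// Assumption: unsigned long long is 64 bits wide, standing in for uint64_t.
using u64 = unsigned long long;

// Line 3 exactly as written in the report.
u64 test_original(u64 a, u64 b, u64 n, int norm) {
    u64 prod = a * b;
    u64 r = (norm - (prod + 1) * n) + n;   // norm is converted to u64 first
    return (r < n) ? r : (r - n);
}

// Line 3 with the "+ 1" and the trailing "+ n" cancelled.
u64 test_simplified(u64 a, u64 b, u64 n, int norm) {
    u64 prod = a * b;
    u64 r = norm - prod * n;
    return (r < n) ? r : (r - n);
}

// Returns true when both forms agree for the given inputs.
bool same_result(u64 a, u64 b, u64 n, int norm) {
    return test_original(a, b, n, norm) == test_simplified(a, b, n, norm);
}
```

Agreement on a handful of inputs is only a sanity check. The actual argument is
the algebra: ( prod + 1 ) * n and prod * n + n are equal modulo 2^64.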

With line 3 simplified this way, I would expect the 'inc' and 'add' instructions
to simply be removed from the generated assembly. However, this isn't the case.
In fact, the simplified source produces assembly that is one instruction longer
than the original (9 instructions instead of 8)
(<a href="https://godbolt.org/z/LkV4_D">https://godbolt.org/z/LkV4_D</a>):

    test(unsigned long, unsigned long, unsigned long, int): # @test(unsigned long, unsigned long, unsigned long, int)
            imul    rdi, rsi
            movsxd  rax, ecx
            imul    rdi, rdx
            sub     rax, rdi
            xor     ecx, ecx
            cmp     rax, rdx
            cmovae  rcx, rdx
            sub     rax, rcx
            ret
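
For comparison, this is my reading of the two listings written back as C++. It
is a sketch, not compiler output: the names u64, model_first_listing and
model_second_listing are hypothetical, u64 again assumes a 64-bit unsigned long
long, and the carry flag is modelled with an ordinary unsigned comparison:

```cpp
// Assumption: unsigned long long is 64 bits wide, standing in for uint64_t.
using u64 = unsigned long long;

// First listing: imul / movsxd / inc / imul / sub / add / cmovb.
u64 model_first_listing(u64 a, u64 b, u64 n, int norm) {
    u64 t = norm - (a * b + 1) * n;  // sub rax, rdi  -> t equals r - n
    u64 sum = t + n;                 // add rdx, rax  -> sum equals r
    bool carry = sum < t;            // carry out of the add, set exactly when r < n
    return carry ? sum : t;          // cmovb rax, rdx
}

// Second listing: imul / movsxd / imul / sub / xor / cmp / cmovae / sub.
u64 model_second_listing(u64 a, u64 b, u64 n, int norm) {
    u64 r = norm - (a * b) * n;      // sub rax, rdi
    u64 delta = (r >= n) ? n : 0;    // xor ecx, ecx; cmp rax, rdx; cmovae rcx, rdx
    return r - delta;                // sub rax, rcx
}
```

In the first listing the 'add' both rebuilds r and sets the flag that 'cmovb'
consumes, while the second listing needs a separate 'cmp' and a zeroed register
to select between r and r - n.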

Is this the expected code generation, or is it a missed optimization?</pre>
        </div>
      </p>


    </body>
</html>