<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [aarch64] Redundant masks in downcast long multiply"
   href="https://bugs.llvm.org/show_bug.cgi?id=47883">47883</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[aarch64] Redundant masks in downcast long multiply
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>10.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: AArch64
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>husseydevin@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>arnaud.degrandmaison@arm.com, llvm-bugs@lists.llvm.org, smithp352@googlemail.com, Ties.Stuij@arm.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>All 64-bit backends appear to do this. 

However, I only think this is important on aarch64; x86_64, RISC-V64, MIPS64,
and PPC64 don't appear to have similar instructions that are any more efficient
than the current method, so I am filing this under aarch64.

Code:

  #include <stdint.h>

  // Multiplies the low 32 bits of each operand into a full 64-bit product.
  uint64_t umull(uint64_t x0, uint64_t x1) {
    // Masking works; an explicit downcast + upcast works too (see below).
    return (x0 & 0xffffffff) * (x1 & 0xffffffff);
  }

  // Signed counterpart: sign-extend the low 32 bits before multiplying.
  int64_t smull(int64_t x0, int64_t x1) {
    return (int64_t)(int32_t)x0 * (int64_t)(int32_t)x1;
  }
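
The "downcast + upcast" variant mentioned in the comment would look like this
(an illustrative alternative; as noted further down, with optimizations
enabled LLVM canonicalizes both forms to the same IR):

  uint64_t umull_cast(uint64_t x0, uint64_t x1) {
    // Hypothetical variant: truncate each operand to 32 bits, then widen one
    // back to 64 bits so the multiplication is done in 64-bit arithmetic.
    return (uint64_t)(uint32_t)x0 * (uint32_t)x1;
  }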


Expected assembly (and what GCC 9.3.0 emits):

umull:
        umull   x0, w0, w1
        ret

smull:
        smull   x0, w0, w1
        ret


Clang 10.0.1:

umull:
        and     x8, x0, #0xffffffff
        and     x9, x1, #0xffffffff
        mul     x0, x8, x9
        ret

smull:
        sxtw    x8, w0
        sxtw    x9, w1
        mul     x0, x8, x9
        ret
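
For reference, the outputs above can be reproduced with standard flags along
these lines (the file name is illustrative):

  clang -O2 --target=aarch64-linux-gnu -S -o - mull.c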

Note that if the parameters are 32-bit integers, the expected code is emitted.
However, LLVM always canonicalizes this:

  %2 = trunc i64 %0 to i32
  %3 = zext i32 %2 to i64

to this:

  %2 = and i64 %0, 4294967295    ; 0xffffffff

so a backend pattern written against the trunc/zext pair will never match.</pre>
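<pre>After canonicalization, then, the IR that reaches the AArch64 backend for
umull is roughly the following (a sketch; value names are illustrative):

  define i64 @umull(i64 %x0, i64 %x1) {
    %a = and i64 %x0, 4294967295    ; 0xffffffff
    %b = and i64 %x1, 4294967295    ; 0xffffffff
    %m = mul i64 %a, %b
    ret i64 %m
  }

So the backend would need to match a mul of two i64 values whose upper 32 bits
are cleared by an and (and, for smull, the corresponding sign-extension
pattern), rather than the trunc/zext pair.</pre>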
        </div>
      </p>


    </body>
</html>