<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Avoid unnecessary widening"
   href="https://bugs.llvm.org/show_bug.cgi?id=50167">50167</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Avoid unnecessary widening
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Scalar Optimizations
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>david.bolvansky@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>From Reddit:

void scale(uint8_t *__restrict dst, const uint8_t *__restrict src1) {
    for(int i=0; i<64; ++i) {
        int v = src1[i] * 3;

        if (v > 255)
            v = 255;

        dst[i] = (uint8_t)v;
    }
}

"Clang manages a decent attempt and recognizes that it can reduce the
multiplication to adds, but fails to stay at byte width and ends up widening
all the way to int, which seriously hurts throughput. It seems that the
optimizer failed to track value ranges properly as it could have stayed at
short (16-bit), and the narrowing also has an unnecessary clamp that the
packuswb instruction already provides."

<a href="https://gcc.godbolt.org/z/85f6YYEq3">https://gcc.godbolt.org/z/85f6YYEq3</a></pre>
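        <pre>For illustration only, here is a rough hand-written sketch of the code
shape the quote seems to be asking for. It is not what Clang emits and not a
proposed patch. It assumes an x86 target with SSE2; the names scale_ref,
scale_sse2 and outputs_match are hypothetical. The largest intermediate value
is 255 * 3 = 765, which fits in a 16-bit lane, and packuswb
(_mm_packus_epi16) saturates to the 0..255 range by itself, so no separate
clamp is needed.

```c
#include "emmintrin.h" /* SSE2 intrinsics */
#include "stdint.h"

/* Reference: the loop from the report, unchanged in behavior. */
void scale_ref(uint8_t *dst, const uint8_t *src1) {
    for (int i = 0; i < 64; ++i) {
        int v = src1[i] * 3;
        if (v > 255)
            v = 255;
        dst[i] = (uint8_t)v;
    }
}

/* Hypothetical version that widens only to 16 bits. */
void scale_sse2(uint8_t *dst, const uint8_t *src1) {
    const __m128i zero = _mm_setzero_si128();
    for (int i = 0; i < 64; i += 16) {
        __m128i in = _mm_loadu_si128((const __m128i *)(src1 + i));
        /* Zero-extend the 16 bytes into two vectors of eight 16-bit lanes. */
        __m128i lo = _mm_unpacklo_epi8(in, zero);
        __m128i hi = _mm_unpackhi_epi8(in, zero);
        /* x * 3 as x + x + x; at most 765, so no 16-bit overflow. */
        lo = _mm_add_epi16(_mm_add_epi16(lo, lo), lo);
        hi = _mm_add_epi16(_mm_add_epi16(hi, hi), hi);
        /* packuswb narrows with unsigned saturation, which is the clamp. */
        _mm_storeu_si128((__m128i *)(dst + i), _mm_packus_epi16(lo, hi));
    }
}

/* Returns 1 if both versions agree on every possible input byte. */
int outputs_match(void) {
    uint8_t src[64], a[64], b[64];
    for (int base = 0; base < 256; base += 64) {
        for (int i = 0; i < 64; ++i)
            src[i] = (uint8_t)(base + i);
        scale_ref(a, src);
        scale_sse2(b, src);
        for (int i = 0; i < 64; ++i)
            if (a[i] != b[i])
                return 0;
    }
    return 1;
}
```

Calling outputs_match() compares the two versions over all 256 input byte
values, which is one way to check the range argument above.</pre>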
        </div>
      </p>


    </body>
</html>