<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Size-inefficient lowering of zero-memset"
   href="https://llvm.org/bugs/show_bug.cgi?id=25725">25725</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Size-inefficient lowering of zero-memset
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>hans@chromium.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Example:

struct s {
  int x[100];
};
void f(struct s *s) {
  for (int i = 0; i < 16; i++) {
    s->x[i] = 0;
  }
}

With -Os -m32, Clang generates:

00000000 <f>:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   31 c9                   xor    %ecx,%ecx
   6:   89 48 04                mov    %ecx,0x4(%eax)
   9:   89 08                   mov    %ecx,(%eax)
   b:   89 48 0c                mov    %ecx,0xc(%eax)
   e:   89 48 08                mov    %ecx,0x8(%eax)
  11:   89 48 14                mov    %ecx,0x14(%eax)
  14:   89 48 10                mov    %ecx,0x10(%eax)
  17:   89 48 1c                mov    %ecx,0x1c(%eax)
  1a:   89 48 18                mov    %ecx,0x18(%eax)
  1d:   89 48 24                mov    %ecx,0x24(%eax)
  20:   89 48 20                mov    %ecx,0x20(%eax)
  23:   89 48 2c                mov    %ecx,0x2c(%eax)
  26:   89 48 28                mov    %ecx,0x28(%eax)
  29:   89 48 34                mov    %ecx,0x34(%eax)
  2c:   89 48 30                mov    %ecx,0x30(%eax)
  2f:   89 48 3c                mov    %ecx,0x3c(%eax)
  32:   89 48 38                mov    %ecx,0x38(%eax)
  35:   c3                      ret

If I bump 16 to 17 in the example above, we get the very short:

00000000 <f>:
   0:   57                      push   %edi
   1:   8b 7c 24 08             mov    0x8(%esp),%edi
   5:   31 c0                   xor    %eax,%eax
   7:   b9 11 00 00 00          mov    $0x11,%ecx
   c:   f3 ab                   rep stos %eax,%es:(%edi)
   e:   5f                      pop    %edi
   f:   c3                      ret


X86TargetLowering sets "MaxStoresPerMemsetOptSize = 8;", which seems pretty
reasonable.

I think what's happening is that we figure this can be done with 8 64-bit
moves, which is within the threshold, and then later we expand those to 32-bit
operations and end up with 16 stores, way above the threshold.</pre>
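        <pre>The arithmetic behind that guess can be sketched in C. This is only a model
of the hypothesis: the function names and the greedy widest-first splitting are
assumptions made for illustration, not LLVM's actual lowering code. Only the
threshold value of 8 comes from X86TargetLowering as quoted above.

```c
/* Hypothetical model of the store-count check; not LLVM's actual code. */

/* MaxStoresPerMemsetOptSize as quoted in the report. */
enum { MAX_STORES_OPT_SIZE = 8 };

/* Number of stores needed to zero `bytes` bytes, greedily using the widest
   store first and halving the width to cover any remaining tail. */
unsigned stores_needed(unsigned bytes, unsigned widest)
{
    unsigned count = 0;
    for (unsigned width = widest; width != 0; width /= 2) {
        count += bytes / width;
        bytes %= width;
    }
    return count;
}

/* Returns 1 if a plan made with `planned_width`-byte stores fits within the
   threshold (memset is expanded inline), 0 otherwise (e.g. rep stos). */
int inlined_as_stores(unsigned bytes, unsigned planned_width)
{
    return MAX_STORES_OPT_SIZE >= stores_needed(bytes, planned_width);
}
```

Under these assumptions, 16 ints (64 bytes) planned with 8-byte stores need
exactly 8 stores, so the plan is accepted, but the same 64 bytes cost 16 stores
once split into 4-byte operations. 17 ints (68 bytes) already need 9 stores at
plan time, so the plan is rejected, which would explain the rep stos output.</pre>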
        </div>
      </p>
    </body>
</html>