[llvm-bugs] [Bug 25725] New: Size-inefficient lowering of zero-memset

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Dec 2 16:14:39 PST 2015


https://llvm.org/bugs/show_bug.cgi?id=25725

            Bug ID: 25725
           Summary: Size-inefficient lowering of zero-memset
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: hans at chromium.org
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Example:

struct s {
  int x[100];
};
void f(struct s *s) {
  for (int i = 0; i < 16; i++) {
    s->x[i] = 0;
  }
}

With -Os -m32, Clang generates:

00000000 <f>:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   31 c9                   xor    %ecx,%ecx
   6:   89 48 04                mov    %ecx,0x4(%eax)
   9:   89 08                   mov    %ecx,(%eax)
   b:   89 48 0c                mov    %ecx,0xc(%eax)
   e:   89 48 08                mov    %ecx,0x8(%eax)
  11:   89 48 14                mov    %ecx,0x14(%eax)
  14:   89 48 10                mov    %ecx,0x10(%eax)
  17:   89 48 1c                mov    %ecx,0x1c(%eax)
  1a:   89 48 18                mov    %ecx,0x18(%eax)
  1d:   89 48 24                mov    %ecx,0x24(%eax)
  20:   89 48 20                mov    %ecx,0x20(%eax)
  23:   89 48 2c                mov    %ecx,0x2c(%eax)
  26:   89 48 28                mov    %ecx,0x28(%eax)
  29:   89 48 34                mov    %ecx,0x34(%eax)
  2c:   89 48 30                mov    %ecx,0x30(%eax)
  2f:   89 48 3c                mov    %ecx,0x3c(%eax)
  32:   89 48 38                mov    %ecx,0x38(%eax)
  35:   c3                      ret

If I bump 16 to 17 in the example above, we get the much shorter (16 bytes
instead of 54):

00000000 <f>:
   0:   57                      push   %edi
   1:   8b 7c 24 08             mov    0x8(%esp),%edi
   5:   31 c0                   xor    %eax,%eax
   7:   b9 11 00 00 00          mov    $0x11,%ecx
   c:   f3 ab                   rep stos %eax,%es:(%edi)
   e:   5f                      pop    %edi
   f:   c3                      ret


X86TargetLowering sets "MaxStoresPerMemsetOptSize = 8;", which seems pretty
reasonable.

I think what's happening is that we figure this can be done with eight 64-bit
moves, which is within the threshold, and then later we expand to 32-bit
operations and end up way above the threshold.
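
To make the arithmetic behind that guess concrete, here is a minimal sketch. It
is not LLVM code: count_stores is a hypothetical helper that assumes a greedy
split using the widest store first, and the only value taken from the backend
is the threshold of 8 quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted from X86TargetLowering in this report. */
enum { MAX_STORES_PER_MEMSET_OPT_SIZE = 8 };

/* Hypothetical helper (an assumption, not an LLVM function): number of
 * stores needed to zero `bytes` bytes when the widest available store is
 * `max_width` bytes. Uses the widest store greedily, then halves the width
 * to cover any tail. `max_width` must be a power of two. */
static unsigned count_stores(size_t bytes, size_t max_width) {
    assert(max_width > 0 && (max_width & (max_width - 1)) == 0);
    unsigned n = 0;
    for (size_t w = max_width; w > 0 && bytes > 0; w /= 2) {
        n += (unsigned)(bytes / w);
        bytes %= w;
    }
    return n;
}

/* Nonzero if the store count is within the -Os threshold. */
static int within_threshold(unsigned stores) {
    return stores <= MAX_STORES_PER_MEMSET_OPT_SIZE;
}
```

Under these assumptions, 16 ints (64 bytes) counted in 64-bit stores gives 8,
which passes the check, while the same 64 bytes in 32-bit stores gives 16, which
matches the 16 mov instructions in the first listing. 17 ints (68 bytes) gives
9 even with 64-bit stores, which would explain why that case falls back to
rep stos.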
