<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - [scheduler] memset stores don't occur from low-to-high addresses"
   href="https://llvm.org/bugs/show_bug.cgi?id=27143">27143</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[scheduler] memset stores don't occur from low-to-high addresses
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Common Code Generator Code
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>spatel+llvm@rotateright.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>This was noticed in the output of <a href="http://reviews.llvm.org/D18566">http://reviews.llvm.org/D18566</a> (<a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Optimize memset for AVX2"
   href="show_bug.cgi?id=27100">bug 27100</a>):

define void @store_32_bytes(i8* %x, i32 %v, i8 %c) {
  call void @llvm.memset.p0i8.i32(i8* %x, i8 42, i32 32, i32 1, i1 false)
  ret void
}

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1) nounwind

$ ./llc -o - memset.ll 
...
    movabsq    $3038287259199220266, %rax ## imm = 0x2A2A2A2A2A2A2A2A
    movq    %rax, 24(%rdi)
    movq    %rax, 16(%rdi)
    movq    %rax, 8(%rdi)
    movq    %rax, (%rdi)
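
For reference, a C-level reproducer might look like the sketch below. It
assumes the compiler lowers the call to the same llvm.memset intrinsic shown
above. The function name is reused from the IR, the unused %v and %c
parameters are dropped, and __builtin_memset (a GCC/Clang builtin) is used
only to keep the snippet free of includes.

```c
/* Hypothetical C equivalent of @store_32_bytes from the IR in this report.
   Assumption: clang turns this call into the llvm.memset intrinsic. */
void store_32_bytes(unsigned char *x) {
    /* Fill 32 bytes starting at x with the byte value 42 (0x2A). */
    __builtin_memset(x, 42, 32);
}
```

Compiling this with something like clang -O2 -S should produce a comparable
sequence of four 8-byte stores, which makes it easy to check the store order
on other targets.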


The stores come out in the reverse of the order in which they were created in
the DAG (high to low address offsets instead of low to high). This suggests
that the scheduler is doing unnecessary work with no hope of any actual perf
improvement.

In the worst case, going backwards through memory might cause a perf regression
on simple HW whose prefetcher only detects forward (ascending-address) accesses.

This is not an x86-specific bug. I see the same behavior for PPC64 and AArch64.</pre>
        </div>
      </p>
    </body>
</html>