<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - [memcpyopt] Missed memcpy->memset optimization"
   href="http://llvm.org/bugs/show_bug.cgi?id=22758">22758</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[memcpyopt] Missed memcpy->memset optimization
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Scalar Optimizations
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>listmail@philipreames.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>We currently don't have a dedicated memcpy-to-memset conversion optimization.
We're instead relying on a call slot optimization, which is unnecessarily
restrictive.  In particular, only the contents of the source region of memory
need to be known, not the contents of the destination memory.

In the example below, we put a non-zero value into the destination location,
and then immediately copy zeros over it.  We fail to convert the memcpy to a
memset if we run only memcpyopt.  

Note that DSE does eliminate the dead store, so the combination of the two
passes works just fine.  There is probably an example where it doesn't (maybe a
partially dead store?), but I didn't look for one for the purposes of this report.



target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @foo([100000 x i32]*)

; Function Attrs: nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) #0

; Function Attrs: nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) #0

define void @testfunc() {
  %src = alloca [100000 x i32], align 4
  %dst = alloca [100000 x i32], align 4
  %1 = bitcast [100000 x i32]* %src to i8*
  %2 = bitcast [100000 x i32]* %dst to i8*
  call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 400000, i32 4, i1 false)
  store i8 47, i8* %2
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %1, i64 400000, i32 4, i1 false)
  call void @foo([100000 x i32]* %dst)
  ret void
}</pre>
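        <pre>For reference, here is a hand-written sketch (not actual opt output) of what
memcpyopt alone could produce for @testfunc, assuming it reasoned only about
the contents of %src.  The declarations and target lines are unchanged.  The
memcpy becomes a memset of %dst, and the store of 47 is left for DSE to clean up.

define void @testfunc() {
  %src = alloca [100000 x i32], align 4
  %dst = alloca [100000 x i32], align 4
  %1 = bitcast [100000 x i32]* %src to i8*
  %2 = bitcast [100000 x i32]* %dst to i8*
  call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 400000, i32 4, i1 false)
  store i8 47, i8* %2
  ; was a memcpy from %1 to %2; %1 is known to hold 400000 zero bytes here,
  ; so what %2 held before the copy does not matter
  call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 400000, i32 4, i1 false)
  call void @foo([100000 x i32]* %dst)
  ret void
}</pre>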
        </div>
      </p>
    </body>
</html>