<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Slow code generated for memcpy of a `zext i1`"
   href="https://llvm.org/bugs/show_bug.cgi?id=31001">31001</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Slow code generated for memcpy of a `zext i1`
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>3.9
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Scalar Optimizations
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>arielb1@mail.tau.ac.il
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>A memcpy with a constant length is lowered to a (fast) sequence of load and
store instructions. A memcpy with a non-constant length is lowered to a call to
the memcpy function, which is slow for short copies.

For example, a memcpy whose length is a `zext i1` (that is, 0 or 1) is
equivalent to a conditional load and store of a single byte, but the generated
IR (and ASM) still contains a call to memcpy:


```
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1)

define i8 @test_load_store(i1 %cond, i8* %buf) {
  %_result = alloca i8, align 8
  %_len = zext i1 %cond to i64
  store i8 0, i8* %_result
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull %_result, i8* nonnull %buf, i64 %_len, i32 1, i1 false)
  %_ret = load i8, i8* %_result
  ret i8 %_ret
}
```
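
For comparison, here is a hand-written sketch of the IR one might hope for
instead. It is not actual optimizer output, and the function name and block
labels are made up for illustration. It branches rather than using a `select`
because %buf is only marked nonnull, not dereferenceable, so the load must not
execute when %cond is false:

```
; Hypothetical desired result: no memcpy call, just a conditional byte load.
define i8 @test_load_store_expected(i1 %cond, i8* %buf) {
entry:
  br i1 %cond, label %copy, label %done

copy:
  ; Only reached when %cond is true, i.e. when the copy length is 1.
  %val = load i8, i8* %buf
  br label %done

done:
  ; 0 is the value originally stored to %_result before the memcpy.
  %ret = phi i8 [ %val, %copy ], [ 0, %entry ]
  ret i8 %ret
}
```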

This causes slowness in Rust's Cursor::read, which we discovered in PR
<a href="https://github.com/rust-lang/rust/pull/37573">https://github.com/rust-lang/rust/pull/37573</a>.</pre>
        </div>
      </p>
    </body>
</html>