<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Introduce a flag to indicate that src==dst is OK for @llvm.memcpy instruction"
   href="http://llvm.org/bugs/show_bug.cgi?id=16761">16761</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Introduce a flag to indicate that src==dst is OK for @llvm.memcpy instruction
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>-New Bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>pbos@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>As tracked/WONTFIXed in <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED WONTFIX - memcpy call with overlapping (same) regions is created by LLVM"
   href="show_bug.cgi?id=11763">bug 11763</a>: an assignment a = b; can be lowered to a
memcpy() call equivalent to memcpy(x, x, sizeof(*x)); when a and b are the same
object, which is incorrect according to both the POSIX and C memcpy interfaces
(the blocks may not overlap). When these calls are executed, Valgrind correctly
reports a warning:

==1340== Thread 8:
==1340== Source and destination overlap in memcpy(0x1231c188, 0x1231c188, 192)
==1340==    at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1340==    by 0x61D02D: webrtc::VP8DecoderImpl::InitDecode(webrtc::VideoCodec
const*, int) (vp8_impl.cc:566)
==1340==    by 0x61CE78: webrtc::VP8DecoderImpl::Reset() (vp8_impl.cc:516)
==1340==    by 0x4B1983: webrtc::VCMGenericDecoder::Reset()
(generic_decoder.cc:203)
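
For illustration, a minimal sketch of source code that can hit this. The struct
and function names are hypothetical, and the 192-byte size is chosen only to
echo the report above; the actual type at vp8_impl.cc:566 is not shown in this
bug. Assigning an object to itself is valid C when the overlap is exact, so the
source is fine; only the generated memcpy() call is questionable.

```c
/* Hypothetical 192-byte aggregate (size picked to match the Valgrind report). */
struct Settings {
    unsigned char bytes[192];
};

/* Plain aggregate assignment. A compiler may lower this to a memcpy() call;
 * when dst == src, that call gets two identical, fully overlapping pointers. */
void assign(struct Settings *dst, const struct Settings *src)
{
    *dst = *src;
}
```

Calling assign(&s, &s) leaves s unchanged, but depending on how the copy is
lowered it may be seen by Valgrind as memcpy(&s, &s, 192).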

Emitting memmove() in the IR instead was rejected because of performance
concerns. Emitting if (src != dst) { memcpy(dst, src, sizeof(*src)); } in the
IR would probably be rejected for the same reason.


However, if the @llvm.memcpy call carries an i1 <src_may_be_dst> flag, then
insertion of this branch can be deferred to backend code generation, which is
left with two options:

small @llvm.memcpy, inlined copy:
- Generate inline instructions; no branch is required (unless the instructions
used actually require src != dst). This is the case where a branch could add
significant cost.

large @llvm.memcpy, generating actual memcpy() call:
- Generate if (src != dst) { memcpy(dst, src, sizeof(*src)); }. This operates
under my personal assumption that "if a memcpy() call must be inserted, the
branching overhead is not very significant". This branch is only inserted when
src_may_be_dst is set.
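
The large-copy option above could behave roughly like the following C sketch.
The helper name copy_large and its int parameter src_may_be_dst are made up for
illustration (the parameter stands in for the proposed i1 flag); this is not an
existing LLVM or libc interface, only a model of what a backend might emit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the proposed lowering for a large copy.
 * src_may_be_dst mirrors the proposed flag: nonzero means the frontend
 * could not rule out dst == src. */
void *copy_large(void *dst, const void *src, size_t n, int src_may_be_dst)
{
    if (src_may_be_dst && dst == src) {
        return dst;  /* exact self-copy: skip the memcpy() call entirely */
    }
    /* Either the pointers differ or the flag promised they cannot be equal. */
    return memcpy(dst, src, n);
}
```

With the flag clear, the helper degenerates to a plain memcpy() call, so copies
that are known not to alias pay nothing extra.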


Benefits of doing this should include:

1. Obeying the contract of the memcpy() interface. Personally I believe this is
very significant, even though the current behavior works with known
implementations.

2. Generating code that can be checked for correctness by Valgrind, without
having Valgrind change. This would include ASan as well, though I'm not sure
what the current state of suppressing that warning is. It would also permit
ASan to add that warning back.

3. Allowing ASan/Valgrind to check explicit memcpy() calls for src == dst.

4. Performance when src is actually equal to dst, because it eliminates a
memcpy() call.


Previous relevant bugs/discussions:

LLVM bug: <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED WONTFIX - memcpy call with overlapping (same) regions is created by LLVM"
   href="show_bug.cgi?id=11763">http://llvm.org/bugs/show_bug.cgi?id=11763</a>
GCC bug: <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480#c6">http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480#c6</a>
Valgrind discussion thread:
<a href="http://sourceforge.net/mailarchive/forum.php?thread_name=CAKGkouv84rFkKrQ%3DeK4FYR3J6nc%3D1wCY9pFw3U_uTeyqA_O%2B4A%40mail.gmail.com&forum_name=valgrind-developers">http://sourceforge.net/mailarchive/forum.php?thread_name=CAKGkouv84rFkKrQ%3DeK4FYR3J6nc%3D1wCY9pFw3U_uTeyqA_O%2B4A%40mail.gmail.com&forum_name=valgrind-developers</a></pre>
        </div>
      </p>
    </body>
</html>