<div dir="ltr"><div>Hi David,</div><div>     Thanks a lot, that makes it much clearer!</div><div><br></div><div>    Regards</div><div>    Jun</div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-02-01 18:03 GMT+08:00 David Chisnall <span dir="ltr"><<a href="mailto:David.Chisnall@cl.cam.ac.uk" target="_blank">David.Chisnall@cl.cam.ac.uk</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On 31 Jan 2018, at 17:36, Jakub (Kuba) Kuderski via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

><br>

> If you want to get rid of memcpy altogether, you can take a look at this pass: <a href="https://github.com/seahorn/seahorn/blob/master/lib/Transforms/Scalar/PromoteMemcpy.cc" target="_blank" rel="noreferrer">https://github.com/seahorn/<wbr>seahorn/blob/master/lib/<wbr>Transforms/Scalar/<wbr>PromoteMemcpy.cc</a> .<br>

<br>

</span>There are at least four different places in LLVM where memcpy intrinsics are expanded to either sequences of instructions or calls:<br>

<br>

- InstCombine does it for very small memcpys (with a broken heuristic).<br>

<br>

- PromoteMemCpy does it mostly to expose other optimisation opportunities.<br>

<br>

- SelectionDAG does it (though in a pretty terrible way, because it can’t create new basic blocks and so can’t emit small loops).<br>

<br>

- Some back ends do it in cooperation with SelectionDAG to provide their own implementation.<br>

<br>

Whether you want a memcpy intrinsic or a sequence of loads and stores depends a little bit on what optimisation you’re doing next - some work better treating individual fields separately, some prefer to have a blob of memory that they can treat as a single entity.<br>
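
<br>

As a rough source-level illustration of the two forms (a sketch, not code from this thread; `struct pair` and both function names are hypothetical), the same copy can be written as one opaque block copy or as one load/store pair per field:<br>

```c
#include <string.h>

/* Hypothetical example type, chosen only for illustration. */
struct pair {
    int   id;
    float weight;
};

/* Whole-object copy: a single blob of sizeof(struct pair) bytes,
 * which is the shape a memcpy intrinsic preserves. */
void copy_as_blob(struct pair *dst, const struct pair *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Field-wise copy: each member is a separate load and store,
 * so later passes can reason about the fields individually. */
void copy_by_field(struct pair *dst, const struct pair *src) {
    dst->id = src->id;
    dst->weight = src->weight;
}
```

Both functions leave the named fields of `*dst` equal to those of `*src`; which form an optimiser handles better depends on the passes that run afterwards.<br>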

<br>

It’s also worth noting that LLVM’s handling of padding in structure fields is particularly bad.  LLVM IR has two kinds of struct: packed and non-packed.  The documentation doesn’t make it clear whether non-packed structs have padding at the end (and clang assumes that they don’t, some of the time).  Non-packed structs do have padding in between fields for alignment.  When lowering from C (or a language needing to support a C ABI), you sometimes end up with padding fields inserted by the front end.  Optimisers have no way of distinguishing these fields from non-padding fields and so we only get rid of them if SROA extracts them and finds that they have no side-effect-free consumers.  In contrast, the padding between fields in non-packed structs disappears as soon as SROA runs.  This can lead to violations of C semantics, where padding fields should not change (because C defines bitwise comparisons on structs using memcmp).  It can also lead to subtly different behaviour in C code depending on the target ABI (we’ve seen cases where trailing padding is copied in one ABI but not in another, depending solely on pointer size).<br>
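
<br>

To make the padding concern concrete, here is a minimal C sketch (an illustration under stated assumptions, not code from this thread; `struct tagged` and the helper names are hypothetical, and the amount of padding depends on the target ABI). A byte-wise copy carries padding along with the fields, whereas a field-wise copy only guarantees the named fields match, so a later memcmp over the whole object may or may not report equality:<br>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: on most ABIs padding follows `tag`
 * so that `value` is suitably aligned. */
struct tagged {
    char tag;
    int  value;
};

/* Bytes of the object that belong to no named field (ABI-dependent). */
size_t padding_bytes(void) {
    return sizeof(struct tagged) - sizeof(char) - sizeof(int);
}

/* Copies the whole object representation, padding included. */
void copy_bytes(struct tagged *dst, const struct tagged *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Copies named fields only; the padding in *dst is not
 * guaranteed to match the padding in *src afterwards. */
void copy_fields(struct tagged *dst, const struct tagged *src) {
    dst->tag = src->tag;
    dst->value = src->value;
}

/* Field-wise equality, which ignores padding entirely. */
int fields_equal(const struct tagged *a, const struct tagged *b) {
    return a->tag == b->tag && a->value == b->value;
}
```

After `copy_bytes`, a memcmp of the two objects over `sizeof(struct tagged)` bytes compares equal; after `copy_fields`, only `fields_equal` is guaranteed to hold.<br>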

<span class="HOEnZb"><font color="#888888"><br>

David<br>

<br>

</font></span></blockquote></div><br></div>