<div dir="ltr"><div>Hi David</div><div> Thanks a lot, that makes things much clearer!</div><div><br></div><div> Regards</div><div> Jun</div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-02-01 18:03 GMT+08:00 David Chisnall <span dir="ltr"><<a href="mailto:David.Chisnall@cl.cam.ac.uk" target="_blank">David.Chisnall@cl.cam.ac.uk</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On 31 Jan 2018, at 17:36, Jakub (Kuba) Kuderski via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>
><br>
> If you want to get rid of memcpy altogether, you can take a look at this pass: <a href="https://github.com/seahorn/seahorn/blob/master/lib/Transforms/Scalar/PromoteMemcpy.cc" target="_blank" rel="noreferrer">https://github.com/seahorn/<wbr>seahorn/blob/master/lib/<wbr>Transforms/Scalar/<wbr>PromoteMemcpy.cc</a> .<br>
<br>
</span>There are at least four different places in LLVM where memcpy intrinsics are expanded to either sequences of instructions or calls:<br>
<br>
- InstCombine does it for very small memcpys (with a broken heuristic).<br>
<br>
- PromoteMemCpy does it mostly to expose other optimisation opportunities.<br>
<br>
- SelectionDAG does it (though in a pretty terrible way, because it can’t create new basic blocks and so can’t emit small loops).<br>
<br>
- Some back ends do it in cooperation with SelectionDAG to provide their own implementation.<br>
<br>
Whether you want a memcpy intrinsic or a sequence of loads and stores depends a little bit on what optimisation you’re doing next - some work better treating individual fields separately, some prefer to have a blob of memory that they can treat as a single entity.<br>
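<br>
As a rough source-level illustration of the two forms (a sketch only: the struct and function names are made up for this example, and what IR a front end actually emits for each depends on the compiler and optimisation level):<br>

```c
#include <string.h>

/* Hypothetical struct, used only for illustration. */
struct point {
    int x;
    int y;
    double weight;
};

/* "Blob" form: the whole object is copied as one block of bytes,
 * which is the shape a memcpy intrinsic represents. */
void copy_blob(struct point *dst, const struct point *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Field-wise form: one load and one store per field, so each
 * field is visible to later passes as a separate value. */
void copy_fields(struct point *dst, const struct point *src) {
    dst->x = src->x;
    dst->y = src->y;
    dst->weight = src->weight;
}
```

Both functions leave the named fields of the destination with the same values; they differ only in whether the copy is presented as a single block of memory or as separate per-field accesses.<br>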
<br>
It’s also worth noting that LLVM’s handling of padding in structure fields is particularly bad. LLVM IR has two kinds of struct: packed and non-packed. The documentation doesn’t make it clear whether non-packed structs have padding at the end (and clang assumes that they don’t, some of the time). Non-packed structs do have padding in between fields for alignment. When lowering from C (or a language needing to support a C ABI), you sometimes end up with explicit padding fields inserted by the front end. Optimisers have no way of distinguishing these fields from non-padding fields, so we only get rid of them if SROA extracts them and finds that they have no consumers with side effects. In contrast, the padding between fields in non-packed structs disappears as soon as SROA runs. This can lead to violations of C semantics, where padding fields should not change (because C defines bitwise comparisons on structs using memcmp). It can also lead to subtly different behaviour in C code depending on the target ABI (we’ve seen cases where trailing padding is copied in one ABI but not in another, depending solely on pointer size).<br>
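<br>
A small C sketch of why padding bytes matter for memcmp-style comparisons. The struct and helper names are hypothetical, and the amount of padding depends on the target ABI; the helpers only measure it with offsetof and sizeof rather than assuming a particular layout:<br>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on many ABIs there is padding after `tag`
 * (to align `value`) and after `flag` (trailing padding). */
struct rec {
    char tag;
    int value;
    char flag;
};

/* Bytes of padding between `tag` and `value`. */
size_t interior_padding(void) {
    return offsetof(struct rec, value) - sizeof(char);
}

/* Bytes of padding after the last field. */
size_t trailing_padding(void) {
    return sizeof(struct rec) - (offsetof(struct rec, flag) + sizeof(char));
}

/* Fill every byte (padding included) with `fill`, then set the fields. */
void init_rec(struct rec *r, unsigned char fill) {
    memset(r, fill, sizeof *r);
    r->tag = 'x';
    r->value = 42;
    r->flag = 1;
}

/* Comparison that looks only at the named fields. */
int fields_equal(const struct rec *a, const struct rec *b) {
    return a->tag == b->tag && a->value == b->value && a->flag == b->flag;
}

/* Comparison over the whole object representation, padding included. */
int bytes_equal(const struct rec *a, const struct rec *b) {
    return memcmp(a, b, sizeof *a) == 0;
}
```

If two objects are set up with init_rec using different fill bytes, fields_equal reports them as equal, while bytes_equal will often report them as different on ABIs where the struct has padding. C leaves the contents of padding bytes unspecified after a store to a member, so the bytes_equal result is not guaranteed either way; that is why a copy that keeps or drops padding can change observable behaviour.<br>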
<span class="HOEnZb"><font color="#888888"><br>
David<br>
<br>
</font></span></blockquote></div><br></div>