<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 18, 2015 at 4:57 PM, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
> On Aug 16, 2015, at 3:08 AM, Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br>
><br>
> [+Richard, John, Chandler]<br>
><br>
> ----- Original Message -----<br>
>> From: "David Majnemer via llvm-commits" <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>><br>
>> To: "Lang Hames" <<a href="mailto:lhames@gmail.com">lhames@gmail.com</a>><br>
>> Cc: "Commit Messages and Patches for LLVM" <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>><br>
>> Sent: Wednesday, August 12, 2015 10:35:16 PM<br>
>> Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.<br>
>><br>
>><br>
>><br>
>> Hi Lang,<br>
>><br>
>><br>
>> I reviewed the InstCombine side and it looks great, all that I ask is<br>
>> that you replace the `Instruction *` with `auto *` because the type<br>
>> is replicated in the `dyn_cast`.<br>
>><br>
>><br>
>> My only concern is that clang will actually create calls to the<br>
>> memcpy intrinsic from within clang with pointers which exactly<br>
>> alias:<br>
>> <a href="http://clang.llvm.org/doxygen/CGExprAgg_8cpp_source.html#l01448" rel="noreferrer" target="_blank">http://clang.llvm.org/doxygen/CGExprAgg_8cpp_source.html#l01448</a><br>
>><br>
><br>
> That seems very broken. We need to fix this. In fact, the comment admits as much:<br>
><br>
> 01454 // memcpy is not defined if the source and destination pointers are exactly<br>
> 01455 // equal, but other compilers do this optimization, and almost every memcpy<br>
> 01456 // implementation handles this case safely. If there is a libc that does not<br>
> 01457 // safely handle this, we can add a target hook.<br>
><br>
> Is there any reason not to emit memmove instead?<br>
<br>
</span>Partial overlaps are generally not possible here — C guarantees that the overlap<br>
in an assignment has to be perfect. So a great deal of the added complexity of<br>
memmove is not needed; and as mentioned, no actual implementations of<br>
memcpy misbehave for perfect overlap.<br>
<br>
That said, I can appreciate wanting to keep the formal model sane. I suspect that<br>
emitting llvm.memmove would generally produce uses of the actual memmove,<br>
but I'd be happy to use a different intrinsic that allows only perfect overlap and which<br>
you can make a “platform-specific” decision to lower to memcpy in the backend.<br></blockquote><div><br></div><div>I suggested to Lang on IRC that we might want to add another operand to the memcpy intrinsic to model this behavior. Another intrinsic would, of course, work just as well. AFAICT, the only code which would care about the distinction between memcpy as-written-in-source and as-written-by-compiler which would make the extra flag a little less disruptive than another intrinsic.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class="HOEnZb"><font color="#888888"><br>
John.<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
><br>
> -Hal<br>
><br>
>><br>
>> I cannot confess to being an alias-analysis expert but my concern is<br>
>> that annotating the memcpy as such could create some interesting<br>
>> paradoxes. Those more familiar with AA should chime in.<br>
>><br>
>><br>
>> I'm afraid I cannot provide any guidance on the SelectionDAG changes.<br>
>><br>
>><br>
>> Thanks<br>
>> --<br>
>> David Majnemer<br>
>><br>
>><br>
>> On Wed, Aug 12, 2015 at 10:06 PM, Lang Hames via llvm-commits <<br>
>> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a> > wrote:<br>
>><br>
>><br>
>><br>
>> Another belated *ping*.<br>
>><br>
>><br>
>><br>
>> - Lang.<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> On Tue, Jun 30, 2015 at 2:16 PM, Lang Hames < <a href="mailto:lhames@gmail.com">lhames@gmail.com</a> ><br>
>> wrote:<br>
>><br>
>><br>
>><br>
>> Belated ping. :)<br>
>><br>
>><br>
>> - Lang.<br>
>><br>
>><br>
>><br>
>><br>
>> On Tue, May 12, 2015 at 2:11 PM, Lang Hames < <a href="mailto:lhames@gmail.com">lhames@gmail.com</a> ><br>
>> wrote:<br>
>><br>
>><br>
>><br>
>> Gentle ping.<br>
>><br>
>><br>
>> - Lang.<br>
>><br>
>><br>
>><br>
>><br>
>> On Tue, Apr 28, 2015 at 11:51 AM, Lang Hames < <a href="mailto:lhames@gmail.com">lhames@gmail.com</a> ><br>
>> wrote:<br>
>><br>
>><br>
>><br>
>> Ping / delayed reply.<br>
>><br>
>><br>
>> DannyB - thanks for the extra context. :)<br>
>><br>
>><br>
>> Chandler - any thoughts? Assuming you still want to keep this<br>
>> transform, what do you think of my instcombine change? (i.e. tagging<br>
>> the load/store pointers as non-aliasing for memcpy).<br>
>><br>
>><br>
>> Cheers,<br>
>> Lang.<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> On Tue, Apr 21, 2015 at 9:21 PM, Daniel Berlin < <a href="mailto:dberlin@dberlin.org">dberlin@dberlin.org</a><br>
>>> wrote:<br>
>><br>
>><br>
>> On Wed, Apr 8, 2015 at 10:38 PM, Chandler Carruth <<br>
>> <a href="mailto:chandlerc@gmail.com">chandlerc@gmail.com</a> > wrote:<br>
>>> On Wed, Apr 8, 2015 at 10:15 PM Lang Hames < <a href="mailto:lhames@gmail.com">lhames@gmail.com</a> ><br>
>>> wrote:<br>
>>>><br>
>>>> Hi David, Chandler,<br>
>>>><br>
>>>> The attached patch guts the SimplifyMemTransfer method, which<br>
>>>> turns small<br>
>>>> memcpys (1/2/4/8 bytes) into load/store pairs. Turning memcpys<br>
>>>> into<br>
>>>> load/store pairs loses information, since we can no longer assume<br>
>>>> the source<br>
>>>> and dest are non-overlapping. This is leading to some suboptimal<br>
>>>> expansions<br>
>>>> for small memcpys on AArch64 when -mno-unaligned-access is turned<br>
>>>> on (see<br>
>>>> r234462). I suspect other architectures would suffer similar<br>
>>>> issues.<br>
>>>><br>
>>>> I assume this transform is an old workaround to simplify other<br>
>>>> non-memcpy-aware IR transforms. These days I think most IR<br>
>>>> transforms can<br>
>>>> reason sensibly about memcpys, so I'm hoping this is safe to<br>
>>>> remove. FWIW,<br>
>>>> removing it didn't hit any regression tests except those that were<br>
>>>> verifying<br>
>>>> that this optimisation was being applied, but then you wouldn't<br>
>>>> really<br>
>>>> expect it to hit any others.<br>
>>><br>
>>><br>
>>> Heh. I tried to remove it before and it regressed a *lot* of<br>
>>> performance.<br>
>>> Have you measured it? I think there are many places that don't<br>
>>> today reason<br>
>>> about memcpy but do reason about loads and stores. Here is a<br>
>>> partial list:<br>
>>><br>
>>> - GVN<br>
>><br>
>> Note:<br>
>><br>
>> It reasons about memory intrinsics, including memcpy and memset, and<br>
>> will pull values out to forward.<br>
>><br>
>> for memset, if the load is in the memset size, it will compute what<br>
>> the value is.<br>
>> For memcpy, it handles anything that is constant memory.<br>
>><br>
>><br>
>> (Certainly, loads and stores are handled better, but ...)<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> llvm-commits mailing list<br>
>> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> llvm-commits mailing list<br>
>> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
>><br>
><br>
> --<br>
> Hal Finkel<br>
> Assistant Computational Scientist<br>
> Leadership Computing Facility<br>
> Argonne National Laboratory<br>
<br>
</div></div></blockquote></div><br></div></div>