<div dir="ltr">I ended up reverting this, and posted a reduced test case for it in <a href="https://llvm.org/PR35519">https://llvm.org/PR35519</a>. I think memdep needs some more work. I'm concerned that memcpyopt (or maybe memdep) is sensitive to debug info.</div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Dec 23, 2017 at 7:13 AM, Hans Wennborg <span dir="ltr"><<a href="mailto:hans@chromium.org" target="_blank">hans@chromium.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This caused Chrome (on Mac) to crash on start-up: <a href="http://crbug.com/797267" rel="noreferrer" target="_blank">http://crbug.com/797267</a><br>
<br>
We don't seem to have a good analysis of the problem yet, but perhaps<br>
we should revert this before it bites others?<br>
<br>
In any case, this is a heads up that there might be a problem here.<br>
<div class="HOEnZb"><div class="h5"><br>
On Tue, Dec 19, 2017 at 5:36 PM, Dan Gohman via llvm-commits<br>
<<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br>
> Author: djg<br>
> Date: Tue Dec 19 17:36:25 2017<br>
> New Revision: 321138<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=321138&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=321138&view=rev</a><br>
> Log:<br>
> [memcpyopt] Teach memcpyopt to optimize across basic blocks<br>
><br>
> This teaches memcpyopt to make a non-local memdep query when a local query<br>
> indicates that the dependency is non-local. This notably allows it to<br>
> eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.<br>
><br>
> This is r319482 and r319483, along with fixes for PR35519: fix the<br>
> optimization that merges stores into memsets to preserve cached memdep<br>
> info, and fix memdep's non-local caching strategy to not assume that larger<br>
> queries are always more conservative than smaller ones.<br>
><br>
> Fixes PR28958 and PR35519.<br>
><br>
> Differential Revision: <a href="https://reviews.llvm.org/D40802" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D40802</a><br>
><br>
> Added:<br>
> llvm/trunk/test/Transforms/<wbr>MemCpyOpt/memcpy-invoke-<wbr>memcpy.ll<br>
> llvm/trunk/test/Transforms/<wbr>MemCpyOpt/merge-into-memset.ll<br>
> llvm/trunk/test/Transforms/<wbr>MemCpyOpt/mixed-sizes.ll<br>
> llvm/trunk/test/Transforms/<wbr>MemCpyOpt/nonlocal-memcpy-<wbr>memcpy.ll<br>
> Modified:<br>
> llvm/trunk/include/llvm/<wbr>Analysis/<wbr>MemoryDependenceAnalysis.h<br>
> llvm/trunk/lib/Analysis/<wbr>MemoryDependenceAnalysis.cpp<br>
> llvm/trunk/lib/Transforms/<wbr>Scalar/MemCpyOptimizer.cpp<br>
><br>
> Modified: llvm/trunk/include/llvm/<wbr>Analysis/<wbr>MemoryDependenceAnalysis.h<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h?rev=321138&r1=321137&r2=321138&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Analysis/<wbr>MemoryDependenceAnalysis.h?<wbr>rev=321138&r1=321137&r2=<wbr>321138&view=diff</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/include/llvm/<wbr>Analysis/<wbr>MemoryDependenceAnalysis.h (original)<br>
> +++ llvm/trunk/include/llvm/<wbr>Analysis/<wbr>MemoryDependenceAnalysis.h Tue Dec 19 17:36:25 2017<br>
> @@ -407,6 +407,12 @@ public:<br>
> void getNonLocalPointerDependency(<wbr>Instruction *QueryInst,<br>
> SmallVectorImpl<<wbr>NonLocalDepResult> &Result);<br>
><br>
> + /// Perform a dependency query specifically for QueryInst's access to Loc.<br>
> + /// The other comments for getNonLocalPointerDependency apply here as well.<br>
> + void getNonLocalPointerDependencyFr<wbr>om(Instruction *QueryInst,<br>
> + const MemoryLocation &Loc, bool isLoad,<br>
> + SmallVectorImpl<<wbr>NonLocalDepResult> &Result);<br>
> +<br>
> /// Removes an instruction from the dependence analysis, updating the<br>
> /// dependence of instructions that previously depended on it.<br>
> void removeInstruction(Instruction *InstToRemove);<br>
><br>
> Modified: llvm/trunk/lib/Analysis/<wbr>MemoryDependenceAnalysis.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=321138&r1=321137&r2=321138&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Analysis/<wbr>MemoryDependenceAnalysis.cpp?<wbr>rev=321138&r1=321137&r2=<wbr>321138&view=diff</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/lib/Analysis/<wbr>MemoryDependenceAnalysis.cpp (original)<br>
> +++ llvm/trunk/lib/Analysis/<wbr>MemoryDependenceAnalysis.cpp Tue Dec 19 17:36:25 2017<br>
> @@ -919,6 +919,14 @@ void MemoryDependenceResults::<wbr>getNonLoca<br>
> Instruction *QueryInst, SmallVectorImpl<<wbr>NonLocalDepResult> &Result) {<br>
> const MemoryLocation Loc = MemoryLocation::get(QueryInst)<wbr>;<br>
> bool isLoad = isa<LoadInst>(QueryInst);<br>
> + return getNonLocalPointerDependencyFr<wbr>om(QueryInst, Loc, isLoad, Result);<br>
> +}<br>
> +<br>
> +void MemoryDependenceResults::<wbr>getNonLocalPointerDependencyFr<wbr>om(<br>
> + Instruction *QueryInst,<br>
> + const MemoryLocation &Loc,<br>
> + bool isLoad,<br>
> + SmallVectorImpl<<wbr>NonLocalDepResult> &Result) {<br>
> BasicBlock *FromBB = QueryInst->getParent();<br>
> assert(FromBB);<br>
><br>
> @@ -1118,21 +1126,15 @@ bool MemoryDependenceResults::<wbr>getNonLoca<br>
> // If we already have a cache entry for this CacheKey, we may need to do some<br>
> // work to reconcile the cache entry and the current query.<br>
> if (!Pair.second) {<br>
> - if (CacheInfo->Size < Loc.Size) {<br>
> - // The query's Size is greater than the cached one. Throw out the<br>
> - // cached data and proceed with the query at the greater size.<br>
> + if (CacheInfo->Size != Loc.Size) {<br>
> + // The query's Size differs from the cached one. Throw out the<br>
> + // cached data and proceed with the query at the new size.<br>
> CacheInfo->Pair = BBSkipFirstBlockPair();<br>
> CacheInfo->Size = Loc.Size;<br>
> for (auto &Entry : CacheInfo->NonLocalDeps)<br>
> if (Instruction *Inst = Entry.getResult().getInst())<br>
> RemoveFromReverseMap(<wbr>ReverseNonLocalPtrDeps, Inst, CacheKey);<br>
> CacheInfo->NonLocalDeps.clear(<wbr>);<br>
> - } else if (CacheInfo->Size > Loc.Size) {<br>
> - // This query's Size is less than the cached one. Conservatively restart<br>
> - // the query using the greater size.<br>
> - return getNonLocalPointerDepFromBB(<br>
> - QueryInst, Pointer, Loc.getWithNewSize(CacheInfo-><wbr>Size), isLoad,<br>
> - StartBB, Result, Visited, SkipFirstBlock);<br>
> }<br>
><br>
> // If the query's AATags are inconsistent with the cached one,<br>
><br>
> Modified: llvm/trunk/lib/Transforms/<wbr>Scalar/MemCpyOptimizer.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/MemCpyOptimizer.cpp?rev=321138&r1=321137&r2=321138&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Transforms/Scalar/<wbr>MemCpyOptimizer.cpp?rev=<wbr>321138&r1=321137&r2=321138&<wbr>view=diff</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/lib/Transforms/<wbr>Scalar/MemCpyOptimizer.cpp (original)<br>
> +++ llvm/trunk/lib/Transforms/<wbr>Scalar/MemCpyOptimizer.cpp Tue Dec 19 17:36:25 2017<br>
> @@ -476,22 +476,33 @@ Instruction *MemCpyOptPass::tryMergingIn<br>
> Alignment = DL.getABITypeAlignment(<wbr>EltType);<br>
> }<br>
><br>
> - AMemSet =<br>
> - Builder.CreateMemSet(StartPtr, ByteVal, Range.End-Range.Start, Alignment);<br>
> + // Remember the debug location.<br>
> + DebugLoc Loc;<br>
> + if (!Range.TheStores.empty())<br>
> + Loc = Range.TheStores[0]-><wbr>getDebugLoc();<br>
><br>
> DEBUG(dbgs() << "Replace stores:\n";<br>
> for (Instruction *SI : Range.TheStores)<br>
> - dbgs() << *SI << '\n';<br>
> - dbgs() << "With: " << *AMemSet << '\n');<br>
> -<br>
> - if (!Range.TheStores.empty())<br>
> - AMemSet->setDebugLoc(Range.<wbr>TheStores[0]->getDebugLoc());<br>
> + dbgs() << *SI << '\n');<br>
><br>
> // Zap all the stores.<br>
> for (Instruction *SI : Range.TheStores) {<br>
> MD->removeInstruction(SI);<br>
> SI->eraseFromParent();<br>
> }<br>
> +<br>
> + // Create the memset after removing the stores, so that if there are any cached<br>
> + // non-local dependencies on the removed instructions in<br>
> + // MemoryDependenceAnalysis, the cache entries are updated to "dirty"<br>
> + // entries pointing below the memset, so subsequent queries include the<br>
> + // memset.<br>
> + AMemSet =<br>
> + Builder.CreateMemSet(StartPtr, ByteVal, Range.End-Range.Start, Alignment);<br>
> + if (!Range.TheStores.empty())<br>
> + AMemSet->setDebugLoc(Loc);<br>
> +<br>
> + DEBUG(dbgs() << "With: " << *AMemSet << '\n');<br>
> +<br>
> ++NumMemSetInfer;<br>
> }<br>
><br>
> @@ -1031,9 +1042,22 @@ bool MemCpyOptPass::<wbr>processMemCpyMemCpyD<br>
> //<br>
> // NOTE: This is conservative, it will stop on any read from the source loc,<br>
> // not just the defining memcpy.<br>
> - MemDepResult SourceDep =<br>
> - MD->getPointerDependencyFrom(<wbr>MemoryLocation::getForSource(<wbr>MDep), false,<br>
> - M->getIterator(), M->getParent());<br>
> + MemoryLocation SourceLoc = MemoryLocation::getForSource(<wbr>MDep);<br>
> + MemDepResult SourceDep = MD->getPointerDependencyFrom(<wbr>SourceLoc, false,<br>
> + M->getIterator(), M->getParent());<br>
> +<br>
> + if (SourceDep.isNonLocal()) {<br>
> + SmallVector<NonLocalDepResult, 2> NonLocalDepResults;<br>
> + MD-><wbr>getNonLocalPointerDependencyFr<wbr>om(M, SourceLoc, /*isLoad=*/false,<br>
> + NonLocalDepResults);<br>
> + if (NonLocalDepResults.size() == 1) {<br>
> + SourceDep = NonLocalDepResults[0].<wbr>getResult();<br>
> + assert((!SourceDep.getInst() ||<br>
> + LookupDomTree().dominates(<wbr>SourceDep.getInst(), M)) &&<br>
> + "when memdep returns exactly one result, it should dominate");<br>
> + }<br>
> + }<br>
> +<br>
> if (!SourceDep.isClobber() || SourceDep.getInst() != MDep)<br>
> return false;<br>
><br>
> @@ -1235,6 +1259,18 @@ bool MemCpyOptPass::processMemCpy(<wbr>MemCpy<br>
> MemDepResult SrcDepInfo = MD->getPointerDependencyFrom(<br>
> SrcLoc, true, M->getIterator(), M->getParent());<br>
><br>
> + if (SrcDepInfo.isNonLocal()) {<br>
> + SmallVector<NonLocalDepResult, 2> NonLocalDepResults;<br>
> + MD-><wbr>getNonLocalPointerDependencyFr<wbr>om(M, SrcLoc, /*isLoad=*/true,<br>
> + NonLocalDepResults);<br>
> + if (NonLocalDepResults.size() == 1) {<br>
> + SrcDepInfo = NonLocalDepResults[0].<wbr>getResult();<br>
> + assert((!SrcDepInfo.getInst() ||<br>
> + LookupDomTree().dominates(<wbr>SrcDepInfo.getInst(), M)) &&<br>
> + "when memdep returns exactly one result, it should dominate");<br>
> + }<br>
> + }<br>
> +<br>
> if (SrcDepInfo.isClobber()) {<br>
> if (MemCpyInst *MDep = dyn_cast<MemCpyInst>(<wbr>SrcDepInfo.getInst()))<br>
> return processMemCpyMemCpyDependence(<wbr>M, MDep);<br>
><br>
> Added: llvm/trunk/test/Transforms/<wbr>MemCpyOpt/memcpy-invoke-<wbr>memcpy.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/MemCpyOpt/memcpy-invoke-memcpy.ll?rev=321138&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/MemCpyOpt/memcpy-<wbr>invoke-memcpy.ll?rev=321138&<wbr>view=auto</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/test/Transforms/<wbr>MemCpyOpt/memcpy-invoke-<wbr>memcpy.ll (added)<br>
> +++ llvm/trunk/test/Transforms/<wbr>MemCpyOpt/memcpy-invoke-<wbr>memcpy.ll Tue Dec 19 17:36:25 2017<br>
> @@ -0,0 +1,48 @@<br>
> +; RUN: opt < %s -memcpyopt -S | FileCheck %s<br>
> +; Test memcpy-memcpy dependencies across invoke edges.<br>
> +<br>
> +; Test that memcpyopt works across the non-unwind edge of an invoke.<br>
> +<br>
> +define hidden void @test_normal(i8* noalias %dst, i8* %src) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {<br>
> +entry:<br>
> + %temp = alloca i8, i32 64<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> + invoke void @invoke_me()<br>
> + to label %try.cont unwind label %lpad<br>
> +<br>
> +lpad:<br>
> + landingpad { i8*, i32 }<br>
> + catch i8* null<br>
> + ret void<br>
> +<br>
> +try.cont:<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %temp, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 64, i32 8, i1 false)<br>
> + ret void<br>
> +}<br>
> +<br>
> +; Test that memcpyopt works across the unwind edge of an invoke.<br>
> +<br>
> +define hidden void @test_unwind(i8* noalias %dst, i8* %src) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {<br>
> +entry:<br>
> + %temp = alloca i8, i32 64<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> + invoke void @invoke_me()<br>
> + to label %try.cont unwind label %lpad<br>
> +<br>
> +lpad:<br>
> + landingpad { i8*, i32 }<br>
> + catch i8* null<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %temp, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 64, i32 8, i1 false)<br>
> + ret void<br>
> +<br>
> +try.cont:<br>
> + ret void<br>
> +}<br>
> +<br>
> +declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1)<br>
> +declare i32 @__gxx_personality_v0(...)<br>
> +declare void @invoke_me() readnone<br>
><br>
> Added: llvm/trunk/test/Transforms/<wbr>MemCpyOpt/merge-into-memset.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/MemCpyOpt/merge-into-memset.ll?rev=321138&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/MemCpyOpt/merge-<wbr>into-memset.ll?rev=321138&<wbr>view=auto</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/test/Transforms/<wbr>MemCpyOpt/merge-into-memset.ll (added)<br>
> +++ llvm/trunk/test/Transforms/<wbr>MemCpyOpt/merge-into-memset.ll Tue Dec 19 17:36:25 2017<br>
> @@ -0,0 +1,45 @@<br>
> +; RUN: opt < %s -memcpyopt -S | FileCheck %s<br>
> +; Update cached non-local dependence information when merging stores into memset.<br>
> +<br>
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:<wbr>32:64-S128"<br>
> +<br>
> +; Don't delete the memcpy in %if.then, even though it depends on an instruction<br>
> +; which will be deleted.<br>
> +<br>
> +; CHECK-LABEL: @foo<br>
> +define void @foo(i1 %c, i8* %d, i8* %e, i8* %f) {<br>
> +entry:<br>
> + %tmp = alloca [50 x i8], align 8<br>
> + %tmp4 = bitcast [50 x i8]* %tmp to i8*<br>
> + %tmp1 = getelementptr inbounds i8, i8* %tmp4, i64 1<br>
> + call void @llvm.memset.p0i8.i64(i8* nonnull %d, i8 0, i64 10, i32 1, i1 false), !dbg !5<br>
> + store i8 0, i8* %tmp4, align 8, !dbg !5<br>
> +; CHECK: call void @llvm.memset.p0i8.i64(i8* nonnull %d, i8 0, i64 10, i32 1, i1 false), !dbg !5<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull %tmp1, i8* nonnull %d, i64 10, i32 1, i1 false)<br>
> + br i1 %c, label %if.then, label %exit<br>
> +<br>
> +if.then:<br>
> +; CHECK: if.then:<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %f, i8* nonnull %tmp4, i64 30, i32 8, i1 false)<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %f, i8* nonnull %tmp4, i64 30, i32 8, i1 false)<br>
> + br label %exit<br>
> +<br>
> +exit:<br>
> + ret void<br>
> +}<br>
> +<br>
> +declare void @llvm.memcpy.p0i8.p0i8.i64(i8*<wbr>, i8*, i64, i32, i1)<br>
> +declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)<br>
> +<br>
> +!llvm.dbg.cu = !{!0}<br>
> +!llvm.module.flags = !{!3, !4}<br>
> +<br>
> +!0 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !1, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)<br>
> +!1 = !DIFile(filename: "t.rs", directory: "/tmp")<br>
> +!2 = !{}<br>
> +!3 = !{i32 2, !"Dwarf Version", i32 4}<br>
> +!4 = !{i32 2, !"Debug Info Version", i32 3}<br>
> +!5 = !DILocation(line: 8, column: 5, scope: !6)<br>
> +!6 = distinct !DISubprogram(name: "bar", scope: !1, file: !1, line: 5, type: !7, isLocal: false, isDefinition: true, scopeLine: 5, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2)<br>
> +!7 = !DISubroutineType(types: !8)<br>
> +!8 = !{null}<br>
><br>
> Added: llvm/trunk/test/Transforms/<wbr>MemCpyOpt/mixed-sizes.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/MemCpyOpt/mixed-sizes.ll?rev=321138&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/MemCpyOpt/mixed-<wbr>sizes.ll?rev=321138&view=auto</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/test/Transforms/<wbr>MemCpyOpt/mixed-sizes.ll (added)<br>
> +++ llvm/trunk/test/Transforms/<wbr>MemCpyOpt/mixed-sizes.ll Tue Dec 19 17:36:25 2017<br>
> @@ -0,0 +1,36 @@<br>
> +; RUN: opt < %s -memcpyopt -S | FileCheck %s<br>
> +; Handle memcpy-memcpy dependencies of differing sizes correctly.<br>
> +<br>
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:<wbr>32:64-S128"<br>
> +<br>
> +; Don't delete the second memcpy, even though there's an earlier<br>
> +; memcpy with a larger size from the same address.<br>
> +<br>
> +; CHECK-LABEL: @foo<br>
> +define i32 @foo(i1 %z) {<br>
> +entry:<br>
> + %a = alloca [10 x i32]<br>
> + %s = alloca [10 x i32]<br>
> + %0 = bitcast [10 x i32]* %a to i8*<br>
> + %1 = bitcast [10 x i32]* %s to i8*<br>
> + call void @llvm.memset.p0i8.i64(i8* nonnull %1, i8 0, i64 40, i32 16, i1 false)<br>
> + %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %a, i64 0, i64 0<br>
> + store i32 1, i32* %arrayidx<br>
> + %scevgep = getelementptr [10 x i32], [10 x i32]* %s, i64 0, i64 1<br>
> + %scevgep7 = bitcast i32* %scevgep to i8*<br>
> + br i1 %z, label %for.body3.lr.ph, label %for.inc7.1<br>
> +<br>
> +for.body3.lr.ph: ; preds = %entry<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %scevgep7, i64 17179869180, i32 4, i1 false)<br>
> + br label %for.inc7.1<br>
> +<br>
> +for.inc7.1:<br>
> +; CHECK: for.inc7.1:<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %scevgep7, i64 4, i32 4, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %scevgep7, i64 4, i32 4, i1 false)<br>
> + %2 = load i32, i32* %arrayidx<br>
> + ret i32 %2<br>
> +}<br>
> +<br>
> +declare void @llvm.memcpy.p0i8.p0i8.i64(i8*<wbr>, i8*, i64, i32, i1)<br>
> +declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)<br>
><br>
> Added: llvm/trunk/test/Transforms/<wbr>MemCpyOpt/nonlocal-memcpy-<wbr>memcpy.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/MemCpyOpt/nonlocal-memcpy-memcpy.ll?rev=321138&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/MemCpyOpt/nonlocal-<wbr>memcpy-memcpy.ll?rev=321138&<wbr>view=auto</a><br>
> ==============================<wbr>==============================<wbr>==================<br>
> --- llvm/trunk/test/Transforms/<wbr>MemCpyOpt/nonlocal-memcpy-<wbr>memcpy.ll (added)<br>
> +++ llvm/trunk/test/Transforms/<wbr>MemCpyOpt/nonlocal-memcpy-<wbr>memcpy.ll Tue Dec 19 17:36:25 2017<br>
> @@ -0,0 +1,114 @@<br>
> +; RUN: opt < %s -memcpyopt -S | FileCheck %s<br>
> +; Make sure memcpy-memcpy dependence is optimized across<br>
> +; basic blocks (conditional branches and invokes).<br>
> +<br>
> +%struct.s = type { i32, i32 }<br>
> +<br>
> +@s_foo = private unnamed_addr constant %struct.s { i32 1, i32 2 }, align 4<br>
> +@s_baz = private unnamed_addr constant %struct.s { i32 1, i32 2 }, align 4<br>
> +@i = external constant i8*<br>
> +<br>
> +declare void @qux()<br>
> +declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1)<br>
> +declare void @__cxa_throw(i8*, i8*, i8*)<br>
> +declare i32 @__gxx_personality_v0(...)<br>
> +declare i8* @__cxa_begin_catch(i8*)<br>
> +<br>
> +; A simple partial redundancy. Test that the second memcpy is optimized<br>
> +; to copy directly from the original source rather than from the temporary.<br>
> +<br>
> +; CHECK-LABEL: @wobble<br>
> +define void @wobble(i8* noalias %dst, i8* %src, i1 %some_condition) {<br>
> +bb:<br>
> + %temp = alloca i8, i32 64<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %temp, i8* nonnull %src, i64 64, i32 8, i1 false)<br>
> + br i1 %some_condition, label %more, label %out<br>
> +<br>
> +out:<br>
> + call void @qux()<br>
> + unreachable<br>
> +<br>
> +more:<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %temp, i64 64, i32 8, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 64, i32 8, i1 false)<br>
> + ret void<br>
> +}<br>
> +<br>
> +; A CFG triangle with a partial redundancy targeting an alloca. Test that the<br>
> +; memcpy inside the triangle is optimized to copy directly from the original<br>
> +; source rather than from the temporary.<br>
> +<br>
> +; CHECK-LABEL: @foo<br>
> +define i32 @foo(i1 %t3) {<br>
> +bb:<br>
> + %s = alloca %struct.s, align 4<br>
> + %t = alloca %struct.s, align 4<br>
> + %s1 = bitcast %struct.s* %s to i8*<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %s1, i8* bitcast (%struct.s* @s_foo to i8*), i64 8, i32 4, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %s1, i8* bitcast (%struct.s* @s_foo to i8*), i64 8, i32 4, i1 false)<br>
> + br i1 %t3, label %bb4, label %bb7<br>
> +<br>
> +bb4: ; preds = %bb<br>
> + %t5 = bitcast %struct.s* %t to i8*<br>
> + %s6 = bitcast %struct.s* %s to i8*<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %t5, i8* %s6, i64 8, i32 4, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %t5, i8* bitcast (%struct.s* @s_foo to i8*), i64 8, i32 4, i1 false)<br>
> + br label %bb7<br>
> +<br>
> +bb7: ; preds = %bb4, %bb<br>
> + %t8 = getelementptr %struct.s, %struct.s* %t, i32 0, i32 0<br>
> + %t9 = load i32, i32* %t8, align 4<br>
> + %t10 = getelementptr %struct.s, %struct.s* %t, i32 0, i32 1<br>
> + %t11 = load i32, i32* %t10, align 4<br>
> + %t12 = add i32 %t9, %t11<br>
> + ret i32 %t12<br>
> +}<br>
> +<br>
> +; A CFG diamond with an invoke on one side, and a partially redundant memcpy<br>
> +; into an alloca on the other. Test that the memcpy inside the diamond is<br>
> +; optimized to copy directly from the original source rather than from the<br>
> +; temporary. This more complex test represents a relatively common usage<br>
> +; pattern.<br>
> +<br>
> +; CHECK-LABEL: @baz<br>
> +define i32 @baz(i1 %t5) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {<br>
> +bb:<br>
> + %s = alloca %struct.s, align 4<br>
> + %t = alloca %struct.s, align 4<br>
> + %s3 = bitcast %struct.s* %s to i8*<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %s3, i8* bitcast (%struct.s* @s_baz to i8*), i64 8, i32 4, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %s3, i8* bitcast (%struct.s* @s_baz to i8*), i64 8, i32 4, i1 false)<br>
> + br i1 %t5, label %bb6, label %bb22<br>
> +<br>
> +bb6: ; preds = %bb<br>
> + invoke void @__cxa_throw(i8* null, i8* bitcast (i8** @i to i8*), i8* null)<br>
> + to label %bb25 unwind label %bb9<br>
> +<br>
> +bb9: ; preds = %bb6<br>
> + %t10 = landingpad { i8*, i32 }<br>
> + catch i8* null<br>
> + br label %bb13<br>
> +<br>
> +bb13: ; preds = %bb9<br>
> + %t15 = call i8* @__cxa_begin_catch(i8* null)<br>
> + br label %bb23<br>
> +<br>
> +bb22: ; preds = %bb<br>
> + %t23 = bitcast %struct.s* %t to i8*<br>
> + %s24 = bitcast %struct.s* %s to i8*<br>
> + call void @llvm.memcpy.p0i8.p0i8.i64(i8* %t23, i8* %s24, i64 8, i32 4, i1 false)<br>
> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %t23, i8* bitcast (%struct.s* @s_baz to i8*), i64 8, i32 4, i1 false)<br>
> + br label %bb23<br>
> +<br>
> +bb23: ; preds = %bb22, %bb13<br>
> + %t17 = getelementptr inbounds %struct.s, %struct.s* %t, i32 0, i32 0<br>
> + %t18 = load i32, i32* %t17, align 4<br>
> + %t19 = getelementptr inbounds %struct.s, %struct.s* %t, i32 0, i32 1<br>
> + %t20 = load i32, i32* %t19, align 4<br>
> + %t21 = add nsw i32 %t18, %t20<br>
> + ret i32 %t21<br>
> +<br>
> +bb25: ; preds = %bb6<br>
> + unreachable<br>
> +}<br>
><br>
><br>
> ______________________________<wbr>_________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>
</div></div></blockquote></div><br></div>