[PATCH] Eliminate memcpy of undefined values
Patrick Walton
pcwalton at mozilla.com
Tue Feb 4 19:33:43 PST 2014
Hi everyone,
This patch removes memcpy operations whose source is a fresh alloca
that has not been written to. The source bytes are undefined, so
skipping the copy and leaving the destination's existing contents in
place is just as valid. SROA frequently creates these memcpys when
aggregates that contain padding are copied around, so this eliminates
a significant chunk (0.5% or so) of code from Rust binaries.
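To illustrate the idea outside of LLVM, here is a minimal sketch on a toy
instruction list. Everything in it (Op, Inst, isFreshAlloca,
removeUndefMemCpys) is made up for illustration and is not LLVM API. It
matches slots by name and ignores aliasing, which the real pass leaves to
memory dependence analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction kinds (hypothetical, not LLVM's classes).
enum class Op { Alloca, Store, MemCpy };

struct Inst {
  Op op;
  std::string dst; // Alloca: slot being created; Store/MemCpy: slot written
  std::string src; // MemCpy only: slot read
};

// Walk backwards from instruction `idx` and report whether `slot` comes
// from an alloca with no write to it in between. Slots that are never
// alloca'd in `body` (e.g. function arguments) are treated as defined.
bool isFreshAlloca(const std::vector<Inst> &body, std::size_t idx,
                   const std::string &slot) {
  assert(idx <= body.size());
  for (std::size_t i = idx; i-- > 0;) {
    const Inst &I = body[i];
    if (I.dst != slot)
      continue;
    // The nearest instruction touching the slot decides the answer:
    // an alloca means the contents are still undefined, anything else
    // (store or memcpy into the slot) means they may be defined.
    return I.op == Op::Alloca;
  }
  return false;
}

// Erase every memcpy whose source is a fresh alloca. Returns the number
// of instructions removed.
std::size_t removeUndefMemCpys(std::vector<Inst> &body) {
  std::size_t removed = 0;
  for (std::size_t i = 0; i < body.size();) {
    if (body[i].op == Op::MemCpy && isFreshAlloca(body, i, body[i].src)) {
      body.erase(body.begin() + static_cast<std::ptrdiff_t>(i));
      ++removed;
    } else {
      ++i;
    }
  }
  return removed;
}
```

On a list shaped like the attached test (alloca, store, memcpy from the
alloca, store) the sketch drops the memcpy and keeps both stores. A memcpy
whose source was stored to first is left alone.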
Thanks!
Patrick
-------------- next part --------------
commit 26bdfd170ea4ad439ac51c5618d6ba8c3d8e6ed1
Author: Patrick Walton <pcwalton at mimiga.net>
Date: Tue Feb 4 17:32:23 2014 -0800
Optimize out memcpy operations that copy undefined data from fresh allocas.
Since the data is undefined, we can simply not do the memcpy and leave the data
that was already there. This helps clean up code that SROA can leave around.
diff --git a/lib/Transforms/Scalar/MemCpyOptimizer.cpp b/lib/Transforms/Scalar/MemCpyOptimizer.cpp
index ea9f57c..0d6af5b 100644
--- a/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+++ b/lib/Transforms/Scalar/MemCpyOptimizer.cpp
@@ -840,9 +840,12 @@ bool MemCpyOpt::processMemCpy(MemCpyInst *M) {
return true;
}
- // The are two possible optimizations we can do for memcpy:
+ // There are three possible optimizations we can do for memcpy:
// a) memcpy-memcpy xform which exposes redundance for DSE.
// b) call-memcpy xform for return slot optimization.
+ // c) memcpy from freshly alloca'd space copies undefined data, and we can
+ // therefore eliminate the memcpy in favor of the data that was already
+ // at the destination.
MemDepResult DepInfo = MD->getDependency(M);
if (DepInfo.isClobber()) {
if (CallInst *C = dyn_cast<CallInst>(DepInfo.getInst())) {
@@ -862,6 +865,13 @@ bool MemCpyOpt::processMemCpy(MemCpyInst *M) {
if (SrcDepInfo.isClobber()) {
if (MemCpyInst *MDep = dyn_cast<MemCpyInst>(SrcDepInfo.getInst()))
return processMemCpyMemCpyDependence(M, MDep, CopySize->getZExtValue());
+ } else if (SrcDepInfo.isDef()) {
+ if (isa<AllocaInst>(SrcDepInfo.getInst())) {
+ MD->removeInstruction(M);
+ M->eraseFromParent();
+ ++NumMemCpyInstr;
+ return true;
+ }
}
return false;
diff --git a/test/Transforms/MemCpyOpt/memcpy-undef.ll b/test/Transforms/MemCpyOpt/memcpy-undef.ll
new file mode 100644
index 0000000..0536427
--- /dev/null
+++ b/test/Transforms/MemCpyOpt/memcpy-undef.ll
@@ -0,0 +1,25 @@
+; RUN: opt < %s -basicaa -memcpyopt -dse -S | FileCheck %s
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-apple-macosx10.8.0"
+
+%struct.foo = type { i8, [7 x i8], i32 }
+
+define i32 @test1(%struct.foo* nocapture %foobie) nounwind noinline ssp uwtable {
+ %bletch.sroa.1 = alloca [7 x i8], align 1
+ %1 = getelementptr inbounds %struct.foo* %foobie, i64 0, i32 0
+ store i8 98, i8* %1, align 4
+ %2 = getelementptr inbounds %struct.foo* %foobie, i64 0, i32 1, i64 0
+ %3 = getelementptr inbounds [7 x i8]* %bletch.sroa.1, i64 0, i64 0
+ call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %3, i64 7, i32 1, i1 false)
+ %4 = getelementptr inbounds %struct.foo* %foobie, i64 0, i32 2
+ store i32 20, i32* %4, align 4
+ ret i32 undef
+
+; Check that the memcpy is removed.
+; CHECK-LABEL: @test1(
+; CHECK-NOT: call void @llvm.memcpy
+}
+
+declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind
+