<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi Meador, <div><br></div><div>I am working on improving clang's compile time, and I ran into a severe compile-time regression in InstCombine that I think is due to your LibCallOptimization changes.  The problem is that InstCombiner::visitCallSite() calls LibCallSimplifier::initOptimizations().  initOptimizations() inserts dozens of function names into a StringMap. It rehashes the map and allocates memory very often, so this initialization takes a long time.  I am profiling a typical C++ program that uses the STL, and I see lots of functions that don't do much besides calling other functions. I have already optimized the memory allocation, but not the insertion into the StringMap. On my program, initOptimizations() takes 13.5% of the execution time of InstCombine.  I understand that we only initialize this data structure once per function, but many C++ programs have lots of small functions, so the cost of initialization is still high. Can you please look into this problem? 
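<br><br>As a rough illustration of the cost pattern described above, and of one possible way to avoid it, here is a minimal sketch. It is not the LLVM code: Optimization, OptTable, buildTable, sharedTable, lookup and NumTableBuilds are hypothetical names, and std::unordered_map stands in for StringMap. The sketch builds the name table lazily, once, instead of once per function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a LibCallOptimization; only the name matters here.
struct Optimization {
  const char *Name;
};

// Hypothetical table type; std::unordered_map stands in for llvm::StringMap.
using OptTable = std::unordered_map<std::string, const Optimization *>;

// Counts how often the table is built, so the cost is observable.
std::size_t NumTableBuilds = 0;

// Builds the name -> optimization table. This is the expensive step
// (hashing, allocation) that should not be repeated for every function.
OptTable buildTable() {
  static const Optimization FPuts{"fputs"};
  static const Optimization FWrite{"fwrite"};
  static const Optimization FPrintF{"fprintf"};
  ++NumTableBuilds;
  OptTable T;
  T.reserve(3); // Reserve space up front to limit rehashing during insertion.
  T.emplace("fputs", &FPuts);
  T.emplace("fwrite", &FWrite);
  T.emplace("fprintf", &FPrintF);
  assert(T.size() == 3 && "duplicate names in the table");
  return T;
}

// Returns a shared table. The function-local static is initialized on the
// first call only, so later calls pay for a lookup and nothing else.
const OptTable &sharedTable() {
  static const OptTable Table = buildTable();
  return Table;
}

// Looks up an optimization by callee name; returns nullptr if there is none.
const Optimization *lookup(const std::string &Callee) {
  const OptTable &T = sharedTable();
  auto It = T.find(Callee);
  return It == T.end() ? nullptr : It->second;
}
```

With this shape, any number of calls to lookup() triggers a single buildTable() call. Whether a shared table like this fits LibCallSimplifier is an open question, since the real optimizations may depend on per-module state; the sketch only shows the initialize-once idea.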
</div><div><br></div><div>Thanks,</div><div>Nadav</div><div> </div><div><br></div><div><br><div><div>On Nov 29, 2012, at 7:45 AM, Meador Inge <<a href="mailto:meadori@codesourcery.com">meadori@codesourcery.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;">Author: meadori<br>Date: Thu Nov 29 09:45:43 2012<br>New Revision: 168893<br><br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project?rev=168893&view=rev">http://llvm.org/viewvc/llvm-project?rev=168893&view=rev</a><br>Log:<br>instcombine: Migrate fputs optimizations<br><br>This patch migrates the fputs optimizations from the simplify-libcalls<br>pass into the instcombine library call simplifier.<br><br>Added:<br>   llvm/trunk/test/Transforms/InstCombine/fputs-1.ll<br>Removed:<br>   llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll<br>Modified:<br>   llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp<br>   llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp<br>   llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll<br><br>Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp<br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=168893&r1=168892&r2=168893&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=168893&r1=168892&r2=168893&view=diff</a><br>==============================================================================<br>--- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original)<br>+++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Thu Nov 29 09:45:43 2012<br>@@ -87,31 +87,6 
@@<br>//===----------------------------------------------------------------------===//<br><br>//===---------------------------------------===//<br>-// 'fputs' Optimizations<br>-<br>-struct FPutsOpt : public LibCallOptimization {<br>-  virtual Value *CallOptimizer(Function *Callee, CallInst *CI, IRBuilder<> &B) {<br>-    // These optimizations require DataLayout.<br>-    if (!TD) return 0;<br>-<br>-    // Require two pointers.  Also, we can't optimize if return value is used.<br>-    FunctionType *FT = Callee->getFunctionType();<br>-    if (FT->getNumParams() != 2 || !FT->getParamType(0)->isPointerTy() ||<br>-        !FT->getParamType(1)->isPointerTy() ||<br>-        !CI->use_empty())<br>-      return 0;<br>-<br>-    // fputs(s,F) --> fwrite(s,1,strlen(s),F)<br>-    uint64_t Len = GetStringLength(CI->getArgOperand(0));<br>-    if (!Len) return 0;<br>-    // Known to have no uses (see above).<br>-    return EmitFWrite(CI->getArgOperand(0),<br>-                      ConstantInt::get(TD->getIntPtrType(*Context), Len-1),<br>-                      CI->getArgOperand(1), B, TD, TLI);<br>-  }<br>-};<br>-<br>-//===---------------------------------------===//<br>// 'puts' Optimizations<br><br>struct PutsOpt : public LibCallOptimization {<br>@@ -153,7 +128,6 @@<br><br>    StringMap<LibCallOptimization*> Optimizations;<br>    // Formatting and IO Optimizations<br>-    FPutsOpt FPuts;<br>    PutsOpt Puts;<br><br>    bool Modified;  // This is only used by doInitialization.<br>@@ -210,7 +184,6 @@<br>/// we know.<br>void SimplifyLibCalls::InitOptimizations() {<br>  // Formatting and IO Optimizations<br>-  AddOpt(LibFunc::fputs, &FPuts);<br>  Optimizations["puts"] = &Puts;<br>}<br><br><br>Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp<br>URL:<span class="Apple-converted-space"> </span><a 
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=168893&r1=168892&r2=168893&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=168893&r1=168892&r2=168893&view=diff</a><br>==============================================================================<br>--- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original)<br>+++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Thu Nov 29 09:45:43 2012<br>@@ -1610,6 +1610,28 @@<br>  }<br>};<br><br>+struct FPutsOpt : public LibCallOptimization {<br>+  virtual Value *callOptimizer(Function *Callee, CallInst *CI, IRBuilder<> &B) {<br>+    // These optimizations require DataLayout.<br>+    if (!TD) return 0;<br>+<br>+    // Require two pointers.  Also, we can't optimize if return value is used.<br>+    FunctionType *FT = Callee->getFunctionType();<br>+    if (FT->getNumParams() != 2 || !FT->getParamType(0)->isPointerTy() ||<br>+        !FT->getParamType(1)->isPointerTy() ||<br>+        !CI->use_empty())<br>+      return 0;<br>+<br>+    // fputs(s,F) --> fwrite(s,1,strlen(s),F)<br>+    uint64_t Len = GetStringLength(CI->getArgOperand(0));<br>+    if (!Len) return 0;<br>+    // Known to have no uses (see above).<br>+    return EmitFWrite(CI->getArgOperand(0),<br>+                      ConstantInt::get(TD->getIntPtrType(*Context), Len-1),<br>+                      CI->getArgOperand(1), B, TD, TLI);<br>+  }<br>+};<br>+<br>} // End anonymous namespace.<br><br>namespace llvm {<br>@@ -1668,6 +1690,7 @@<br>  SPrintFOpt SPrintF;<br>  FPrintFOpt FPrintF;<br>  FWriteOpt FWrite;<br>+  FPutsOpt FPuts;<br><br>  void initOptimizations();<br>  void addOpt(LibFunc::Func F, LibCallOptimization* Opt);<br>@@ -1795,6 +1818,7 @@<br>  addOpt(LibFunc::sprintf, &SPrintF);<br>  addOpt(LibFunc::fprintf, &FPrintF);<br>  addOpt(LibFunc::fwrite, &FWrite);<br>+  addOpt(LibFunc::fputs, &FPuts);<br>}<br><br>Value 
*LibCallSimplifierImpl::optimizeCall(CallInst *CI) {<br><br>Modified: llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll<br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll?rev=168893&r1=168892&r2=168893&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll?rev=168893&r1=168892&r2=168893&view=diff</a><br>==============================================================================<br>--- llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll (original)<br>+++ llvm/trunk/test/Transforms/InstCombine/fprintf-1.ll Thu Nov 29 09:45:43 2012<br>@@ -38,13 +38,14 @@<br>}<br><br>; Check fprintf(fp, "%s", str) -> fputs(str, fp).<br>+; NOTE: The fputs simplifier simplifies this further to fwrite.<br><br>define void @test_simplify3(%FILE* %fp) {<br>; CHECK: @test_simplify3<br>  %fmt = getelementptr [3 x i8]* @percent_s, i32 0, i32 0<br>  %str = getelementptr [13 x i8]* @hello_world, i32 0, i32 0<br>  call i32 (%FILE*, i8*, ...)* @fprintf(%FILE* %fp, i8* %fmt, i8* %str)<br>-; CHECK-NEXT: call i32 @fputs(i8* getelementptr inbounds ([13 x i8]* @hello_world, i32 0, i32 0), %FILE* %fp)<br>+; CHECK-NEXT: call i32 @fwrite(i8* getelementptr inbounds ([13 x i8]* @hello_world, i32 0, i32 0), i32 12, i32 1, %FILE* %fp)<br>  ret void<br>; CHECK-NEXT: ret void<br>}<br><br>Added: llvm/trunk/test/Transforms/InstCombine/fputs-1.ll<br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fputs-1.ll?rev=168893&view=auto">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fputs-1.ll?rev=168893&view=auto</a><br>==============================================================================<br>--- llvm/trunk/test/Transforms/InstCombine/fputs-1.ll (added)<br>+++ llvm/trunk/test/Transforms/InstCombine/fputs-1.ll Thu Nov 29 09:45:43 2012<br>@@ -0,0 +1,43 @@<br>+; 
Test that the fputs library call simplifier works correctly.<br>+;<br>+; RUN: opt < %s -instcombine -S | FileCheck %s<br>+<br>+target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"<br>+<br>+%FILE = type { }<br>+<br>+@empty = constant [1 x i8] zeroinitializer<br>+@A = constant [2 x i8] c"A\00"<br>+@hello = constant [7 x i8] c"hello\0A\00"<br>+<br>+declare i32 @fputs(i8*, %FILE*)<br>+<br>+; Check fputs(str, fp) --> fwrite(str, 1, strlen(s), fp).<br>+<br>+define void @test_simplify1(%FILE* %fp) {<br>+; CHECK: @test_simplify1<br>+  %str = getelementptr [1 x i8]* @empty, i32 0, i32 0<br>+  call i32 @fputs(i8* %str, %FILE* %fp)<br>+  ret void<br>+; CHECK-NEXT: ret void<br>+}<br>+<br>+; NOTE: The fwrite simplifier simplifies this further to fputc.<br>+<br>+define void @test_simplify2(%FILE* %fp) {<br>+; CHECK: @test_simplify2<br>+  %str = getelementptr [2 x i8]* @A, i32 0, i32 0<br>+  call i32 @fputs(i8* %str, %FILE* %fp)<br>+; CHECK-NEXT: call i32 @fputc(i32 65, %FILE* %fp)<br>+  ret void<br>+; CHECK-NEXT: ret void<br>+}<br>+<br>+define void @test_simplify3(%FILE* %fp) {<br>+; CHECK: @test_simplify3<br>+  %str = getelementptr [7 x i8]* @hello, i32 0, i32 0<br>+  call i32 @fputs(i8* %str, %FILE* %fp)<br>+; CHECK-NEXT: call i32 @fwrite(i8* getelementptr inbounds ([7 x i8]* @hello, i32 0, i32 0), i32 6, i32 1, %FILE* %fp)<br>+  ret void<br>+; CHECK-NEXT: ret void<br>+}<br><br>Removed: llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll<br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll?rev=168892&view=auto">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll?rev=168892&view=auto</a><br>==============================================================================<br>--- llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll (original)<br>+++ 
llvm/trunk/test/Transforms/SimplifyLibCalls/FPuts.ll (removed)<br>@@ -1,30 +0,0 @@<br>-; Test that the FPutsOptimizer works correctly<br>-; RUN: opt < %s -simplify-libcalls -S | FileCheck %s<br>-<br>-; This transformation requires the pointer size, as it assumes that size_t is<br>-; the size of a pointer.<br>-target datalayout = "p:64:64:64"<br>-<br>-<span class="Apple-tab-span" style="white-space: pre;">     </span>%struct._IO_FILE = type { i32, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, %struct._IO_marker*, %struct._IO_FILE*, i32, i32, i32, i16, i8, [1 x i8], i8*, i64, i8*, i8*, i32, [52 x i8] }<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span>%struct._IO_marker = type { %struct._IO_marker*, %struct._IO_FILE*, i32 }<br>-@stdout = external global %struct._IO_FILE*<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <%struct._IO_FILE**> [#uses=1]<br>-@empty = constant [1 x i8] zeroinitializer<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <[1 x i8]*> [#uses=1]<br>-@len1 = constant [2 x i8] c"A\00"<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <[2 x i8]*> [#uses=1]<br>-@long = constant [7 x i8] c"hello\0A\00"<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <[7 x i8]*> [#uses=1]<br>-<br>-declare i32 @fputs(i8*, %struct._IO_FILE*)<br>-<br>-define i32 @main() {<br>-; CHECK: define i32 @main() {<br>-entry:<br>-<span class="Apple-tab-span" style="white-space: pre;">       </span>%out = load %struct._IO_FILE** @stdout<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <%struct._IO_FILE*> [#uses=3]<br>-<span 
class="Apple-tab-span" style="white-space: pre;">  </span>%s1 = getelementptr [1 x i8]* @empty, i32 0, i32 0<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i8*> [#uses=1]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>%s2 = getelementptr [2 x i8]* @len1, i32 0, i32 0<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i8*> [#uses=1]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>%s3 = getelementptr [7 x i8]* @long, i32 0, i32 0<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i8*> [#uses=1]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>%a = call i32 @fputs( i8* %s1, %struct._IO_FILE* %out )<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i32> [#uses=0]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>%b = call i32 @fputs( i8* %s2, %struct._IO_FILE* %out )<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i32> [#uses=0]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>%c = call i32 @fputs( i8* %s3, %struct._IO_FILE* %out )<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>; <i32> [#uses=0]<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span>ret i32 0<br>-<br>-; CHECK-NOT: @fputs(<br>-}<br><br><br>_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br><a 
href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a></div></blockquote></div><br></div></body></html>