[llvm-commits] [commit-after-approval] printf("foo %s baz\n", "bar") -> puts("foo bar baz")

Nick Lewycky nicholas at mxc.ca
Wed Aug 5 08:17:09 PDT 2009


Ryan Flynn wrote:
> given a printf() call format containing a single "%s" conversion
> specifier and a constant string parameter, inlines parameter into
> format and lowers printf() to puts().
> 
> it's a bit more complex than the other printf() optimizations, but i
> happened to run into the situation and it seemed pretty common and
> straightforward to implement.
> 
> my first attempt at an optimization, constructive criticism appreciated.

Index: lib/Transforms/Scalar/SimplifyLibCalls.cpp
===================================================================
--- lib/Transforms/Scalar/SimplifyLibCalls.cpp	(revision 77970)
+++ lib/Transforms/Scalar/SimplifyLibCalls.cpp	(working copy)
@@ -1323,6 +1323,32 @@
        EmitPutS(CI->getOperand(2), B);
        return CI;
      }
+
+    // printf("foo %s baz\n", "bar") -> puts("foo bar baz")
+    // TODO: eliminate PutS and '\n' suffix dependency, if possible
+    // TODO: implement for multiple string parameters

I think you're conflating two transformations here. You should be able 
to fold "foo %s baz" with a constant parameter "bar" into "foo bar baz". 
This could apply iteratively to multiple %s parts. Once that's done, 
there's another transformation which turns printf("foo bar baz\n") into 
puts("foo bar baz"), so there's no need to check for the \n here, just 
emit a printf at the end instead.

+    size_t s;
+    if (CI->getNumOperands() == 3 &&
+        CI->use_empty() &&
+        isa<PointerType>(CI->getOperand(2)->getType()) &&
+        FormatStr[FormatStr.size()-1] == '\n' &&
+        (s = FormatStr.find('%')) != std::string::npos &&
+        FormatStr[s+1] == 's' &&                          // "%s"
+        FormatStr.find('%', s+1) == std::string::npos) {  // no others

I'm pretty sure that the "FormatStr[s+1] == 's'" part will dereference 
past the end of the string given a string like "foo %".

Maybe you can break this if-statement up into a few pieces? It looks 
horrible to have it all as one huge expression, especially when it 
assigns to 'size_t s' somewhere in the middle.

+      std::string Arg;
+      if (GetConstantStringInfo(CI->getOperand(2), Arg)) {
+        // inline arg into format string, replace "%s" and '\n'
+        std::string Merge(FormatStr, 0, s);
+        Merge.reserve(FormatStr.size() + Arg.size());
+        Merge.append(Arg);
+        Merge.append(FormatStr, s+2, FormatStr.size()-(s+2)-1);
+        Constant *C = ConstantArray::get(Merge, true);
+        C = new GlobalVariable(*Callee->getParent(), C->getType(), true,
+                              GlobalVariable::InternalLinkage, C, "str");

Please use PrivateLinkage here, not InternalLinkage. Current LLVM uses 
private for string literals.

Nick

+        EmitPutS(C, B);
+        return CI;
+      }
+    }
      return 0;
    }
  };
Index: test/Transforms/SimplifyLibCalls/Printf.ll
===================================================================
--- test/Transforms/SimplifyLibCalls/Printf.ll	(revision 77970)
+++ test/Transforms/SimplifyLibCalls/Printf.ll	(working copy)
@@ -3,7 +3,8 @@
  ; RUN:   not grep {call.*printf}

  @str = internal constant [13 x i8] c"hello world\0A\00"         ; <[13 
x i8]*> [#uses=1]
- at str1 = internal constant [2 x i8] c"h\00"              ; <[2 x i8]*> 
[#uses=1]
+ at str1 = internal constant [2 x i8] c"h\00"              ; <[2 x i8]*> 
[#uses=2]
+ at str2 = internal constant [6 x i8] c"<%s>\0A\00"      ; <[6 x i8]*> 
[#uses=1]

  define void @foo() {
  entry:
@@ -19,3 +20,9 @@
          ret void
  }

+define void @baz() {
+entry:
+        %tmp1 = tail call i32 (i8*, ...)* @printf(i8* getelementptr ([6 
x i8]* @str2, i32 0, i32 0), i8* getelementptr ([2 x i8]* @str1, i32 0, 
i32 0)) nounwind        ; <i32> [#uses=0]
+        ret void
+}
+



More information about the llvm-commits mailing list