[llvm-commits] [PATCH] optsize attribute on top of -Oz gives bigger code
Chandler Carruth
chandlerc at google.com
Fri May 11 19:07:46 PDT 2012
On Thu, May 10, 2012 at 7:47 AM, Patrik Hägglund H <
patrik.h.hagglund at ericsson.com> wrote:
> **
> OK. I checked this a second time, and found that it wasn't that hard to
> add support for opt -Os/-Oz, to mirror the clang driver.
>
> The attached patch contains
> * my previous patch for the inliner,
> * the changes to the opt driver, and
> * a testcase to test the inliner.
>
> The opt -Os and -Oz changes has been tested to compile and execute our own
> (quite small) collection of C programs.
>
Very cool, the test case is nice. =]
A couple of requests:
1) Can you separate out the opt tool change and commit that separately? I
think that makes for a cleaner set of changes.
2) It would be nice to get a very boring test case that just tries to use
-Os and -Oz in the 'opt' tool. Ideally, we would ask the opt tool to dump
the pass manager information and FileCheck over it.
Also, some comments on the patches themselves:
diff --git a/tools/opt/opt.cpp b/tools/opt/opt.cpp
index a5b0511..c879d4b 100644
--- a/tools/opt/opt.cpp
+++ b/tools/opt/opt.cpp
@@ -104,15 +104,23 @@ StandardLinkOpts("std-link-opts",
static cl::opt<bool>
OptLevelO1("O1",
- cl::desc("Optimization level 1. Similar to llvm-gcc -O1"));
+ cl::desc("Optimization level 1. Similar to clang -O1"));
static cl::opt<bool>
OptLevelO2("O2",
- cl::desc("Optimization level 2. Similar to llvm-gcc -O2"));
+ cl::desc("Optimization level 2. Similar to clang -O2"));
static cl::opt<bool>
OptLevelO3("O3",
- cl::desc("Optimization level 3. Similar to llvm-gcc -O3"));
+ cl::desc("Optimization level 3. Similar to clang -O3"));
+
+static cl::opt<bool>
+OptLevelOs("Os",
+ cl::desc("Like -O2 with extra optimizations for size. Similar
to clang -Os"));
+
+static cl::opt<bool>
+OptLevelOz("Oz",
+ cl::desc("Like -Os but reduces code size further. Similar to
clang -Oz"));
static cl::opt<std::string>
TargetTriple("mtriple", cl::desc("Override target triple for module"));
@@ -409,16 +417,21 @@ static inline void addPass(PassManagerBase &PM, Pass
*P) {
///
/// OptLevel - Optimization Level
static void AddOptimizationPasses(PassManagerBase &MPM,FunctionPassManager
&FPM,
- unsigned OptLevel) {
+ unsigned OptLevel, unsigned SizeLevel) {
FPM.add(createVerifierPass()); // Verify that input is
correct
PassManagerBuilder Builder;
Builder.OptLevel = OptLevel;
+ Builder.SizeLevel = SizeLevel;
if (DisableInline) {
// No inlining pass
} else if (OptLevel > 1) {
unsigned Threshold = 225;
+ if (SizeLevel == 1) // -Os
+ Threshold = 75;
+ else if (SizeLevel == 2) // -Oz
+ Threshold = 25;
if (OptLevel > 2)
Threshold = 275;
Builder.Inliner = createFunctionInliningPass(Threshold);
@@ -571,7 +584,7 @@ int main(int argc, char **argv) {
Passes.add(TD);
OwningPtr<FunctionPassManager> FPasses;
- if (OptLevelO1 || OptLevelO2 || OptLevelO3) {
+ if (OptLevelO1 || OptLevelO2 || OptLevelO3 || OptLevelOs || OptLevelOz) {
FPasses.reset(new FunctionPassManager(M.get()));
if (TD)
FPasses->add(new TargetData(*TD));
@@ -617,20 +630,30 @@ int main(int argc, char **argv) {
}
if (OptLevelO1 && OptLevelO1.getPosition() < PassList.getPosition(i)) {
- AddOptimizationPasses(Passes, *FPasses, 1);
+ AddOptimizationPasses(Passes, *FPasses, 1, 0);
OptLevelO1 = false;
}
if (OptLevelO2 && OptLevelO2.getPosition() < PassList.getPosition(i)) {
- AddOptimizationPasses(Passes, *FPasses, 2);
+ AddOptimizationPasses(Passes, *FPasses, 2, 0);
OptLevelO2 = false;
}
if (OptLevelO3 && OptLevelO3.getPosition() < PassList.getPosition(i)) {
- AddOptimizationPasses(Passes, *FPasses, 3);
+ AddOptimizationPasses(Passes, *FPasses, 3, 0);
OptLevelO3 = false;
}
+ if (OptLevelOs && OptLevelOs.getPosition() < PassList.getPosition(i)) {
+ AddOptimizationPasses(Passes, *FPasses, 2, 1);
+ OptLevelOs = false;
+ }
+
+ if (OptLevelOz && OptLevelOz.getPosition() < PassList.getPosition(i)) {
+ AddOptimizationPasses(Passes, *FPasses, 2, 2);
+ OptLevelOz = false;
+ }
+
const PassInfo *PassInf = PassList[i];
Pass *P = 0;
if (PassInf->getNormalCtor())
@@ -682,15 +705,21 @@ int main(int argc, char **argv) {
}
if (OptLevelO1)
- AddOptimizationPasses(Passes, *FPasses, 1);
+ AddOptimizationPasses(Passes, *FPasses, 1, 0);
if (OptLevelO2)
- AddOptimizationPasses(Passes, *FPasses, 2);
+ AddOptimizationPasses(Passes, *FPasses, 2, 0);
if (OptLevelO3)
- AddOptimizationPasses(Passes, *FPasses, 3);
+ AddOptimizationPasses(Passes, *FPasses, 3, 0);
+
+ if (OptLevelOs)
+ AddOptimizationPasses(Passes, *FPasses, 2, 1);
+
+ if (OptLevelOz)
+ AddOptimizationPasses(Passes, *FPasses, 2, 2);
This... doesn't seem right. For O1, O2, and O3, I think the idea is that
each is a superset of the previous, and so adding them in sequence makes
sense.... but I thought that Os and Oz actually ran fewer optimizations
passes than O2, an the certainly run fewer than O3 don't they?
I wonder if this could be caught with the test I mentioned? Not sure, it
might be hard to interpret the tea leaves of the opt pass debugging output.
On the original patch to fix things, a few very minor nits on the test
case. I think this one is essentially good to go in once these are
addressed (and the opt patch landed naturally):
diff --git a/test/Transforms/Inline/inline-optsize.ll
b/test/Transforms/Inline/inline-optsize.ll
new file mode 100644
index 0000000..1455ea2
--- /dev/null
+++ b/test/Transforms/Inline/inline-optsize.ll
@@ -0,0 +1,35 @@
+; RUN: opt -S -Oz %s | FileCheck %s
+; xxx: clang -cc1 -S -emit-llvm -o - -Oz %s | FileCheck %s
No need for the clang line here...
+
+; The inline threshold for a function with the optsize attribute is
currently
+; the same as the global inline threshold for -Os. Check that the optsize
+; function attribute don't alter the function specific inline threshold if
the
+; global inline threshold is lower (as for -Oz).
+
+ at a = global i32 4
+
+; This function should be larger than the inline threshold for -Oz (25),
but
+; smaller than the inline threshold for optsize (75). It should not be
inlined
+; below.
It's important to make sure that we don't regress in the other direction
here -- we might decide to never inline this function regardless of the
threshold, and then this test might pass for the wrong reason.
My suggestion would be to run opt and filecheck twice. Once just as you
have it, and once with a different --check-prefix and -O2 or something to
assert that it does get inlined in that case.
Thanks!
-Chandler
+define i32 @inner() {
+ %a1 = load volatile i32* @a
+ %x1 = add i32 %a1, %a1
+ %a2 = load volatile i32* @a
+ %x2 = add i32 %x1, %a2
+ %a3 = load volatile i32* @a
+ %x3 = add i32 %x2, %a3
+ %a4 = load volatile i32* @a
+ %x4 = add i32 %x3, %a4
+ %a5 = load volatile i32* @a
+ %x5 = add i32 %x3, %a5
+ ret i32 %x5
+}
+
+define i32 @outer() optsize {
+; CHECK: @outer
+; CHECK: call
+; CHECK: ret
+
+ %r = call i32 @inner()
+ ret i32 %r
+}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120511/86cad807/attachment.html>
More information about the llvm-commits
mailing list