[PATCH] D19659: [X86] Enable RRL part of the LEA optimization pass for -O2
Andrey Turetskiy via llvm-commits
llvm-commits at lists.llvm.org
Wed May 4 08:43:40 PDT 2016
aturetsk added a comment.
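The RRL part of the LEA pass adds only a modest amount of compile time. On the stress test below, enabling only RRL costs about 5.4s (roughly 9%) of wall-clock time over having the pass disabled, while the full pass costs about 14.8s (roughly 26%).
Here are the measurements.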
-Os, the LEA pass is completely disabled:
real 0m57.797s
user 0m57.448s
sys 0m0.337s
-Os, only the RRL part of the LEA pass is enabled:
real 1m3.238s
user 1m2.868s
sys 0m0.352s
-Os, the LEA pass is fully enabled:
real 1m12.568s
user 1m12.193s
sys 0m0.354s
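For reference, the relative overhead can be worked out from the "real" times above. This is only a small arithmetic sketch in Python 3; the names `real`, `to_seconds` and `overhead` are made up for illustration and are not part of the patch or of gen.py.

```python
# Wall-clock ("real") times copied from the measurements above, in seconds.
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

real = {
    "disabled": to_seconds(0, 57.797),   # LEA pass completely disabled
    "rrl_only": to_seconds(1, 3.238),    # only the RRL part enabled
    "full": to_seconds(1, 12.568),       # LEA pass fully enabled
}

def overhead(config, baseline="disabled"):
    """Return (extra seconds, extra percent) of `config` relative to `baseline`."""
    delta = real[config] - real[baseline]
    return delta, 100.0 * delta / real[baseline]

for name in ("rrl_only", "full"):
    delta, pct = overhead(name)
    print("%s: +%.3fs (+%.1f%%)" % (name, delta, pct))
```

This gives about +5.4s (+9.4%) for RRL only and about +14.8s (+25.6%) for the full pass, so on this test RRL accounts for a bit more than a third of the pass's total cost.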
The test was generated by the script:
$ python gen.py 5000 > test.c
$ cat gen.py
import sys

def foo(n):
    print 'struct { int a, b, c; } arr[1000000];'
    print ''
    print 'int foo(int x) {'
    print ' int r = 0;'
    for i in range(n):
        print ' r += arr[x + %d].a + arr[x + %d].b + arr[x + %d].c;' % (i, i, i);
    print ' switch (r) {'
    print ' case 1:'
    for i in range(n):
        print ' arr[x + %d].b = 111;' % (i);
        print ' arr[x + %d].c = 111;' % (i);
    print ' break;'
    print ' case 2:'
    for i in range(n):
        print ' arr[x + %d].b = 222;' % (i);
        print ' arr[x + %d].c = 222;' % (i);
    print ' break;'
    print ' default:'
    for i in range(n):
        # Make the LEAs irreplaceable, so that no LEAs would be removed by the LEA
        # pass and thus there would be no compile-time improvement because of the
        # reduced number of instructions which need to be processed by the
        # compiler in other passes
        print ' arr[x + %d].b = (int) &arr[x + %d].b;' % (i, i);
        print ' arr[x + %d].c = (int) &arr[x + %d].c;' % (i, i);
    print ' break;'
    print ' }'
    print ' return r;'
    print '}'

if __name__ == '__main__':
    foo(int(sys.argv[1]))
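The script above uses Python 2 print statements. If `python` on your machine is Python 3, something like the following sketch should produce the same test.c. The function name `generate` and the list-of-lines structure are my own choices for illustration; the emitted C text is intended to match the original line for line.

```python
import sys

def generate(n):
    """Return the C source of the stress test as one string (Python 3 port)."""
    lines = [
        'struct { int a, b, c; } arr[1000000];',
        '',
        'int foo(int x) {',
        ' int r = 0;',
    ]
    for i in range(n):
        lines.append(' r += arr[x + %d].a + arr[x + %d].b + arr[x + %d].c;' % (i, i, i))
    lines.append(' switch (r) {')
    # The two constant-store cases differ only in label and stored value.
    for label, value in (('case 1:', 111), ('case 2:', 222)):
        lines.append(' ' + label)
        for i in range(n):
            lines.append(' arr[x + %d].b = %d;' % (i, value))
            lines.append(' arr[x + %d].c = %d;' % (i, value))
        lines.append(' break;')
    lines.append(' default:')
    for i in range(n):
        # Storing each field's own address keeps the LEAs irreplaceable,
        # as explained in the comment in the original script.
        lines.append(' arr[x + %d].b = (int) &arr[x + %d].b;' % (i, i))
        lines.append(' arr[x + %d].c = (int) &arr[x + %d].c;' % (i, i))
    lines.append(' break;')
    lines.append(' }')
    lines.append(' return r;')
    lines.append('}')
    return '\n'.join(lines) + '\n'

if __name__ == '__main__' and len(sys.argv) > 1:
    sys.stdout.write(generate(int(sys.argv[1])))
```

It is invoked the same way as the original, e.g. `python3 gen.py 5000 > test.c` if saved under the same file name.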
The run command:
time ./bin/clang -Os -S test.c
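If you want to repeat the measurement from a script rather than the shell, a rough Python 3 equivalent of `time` could look like the sketch below. The helper name `time_command` is hypothetical, and the user/sys figures rely on `os.times()` reporting child-process times, which is the case on Unix-like systems.

```python
import os
import subprocess
import sys
import time

def time_command(cmd):
    """Run cmd once and return (real, user, system) seconds, like the shell's `time`."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time consumed by waited-for child processes since `before` was taken.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system

if __name__ == '__main__' and len(sys.argv) > 1:
    # Example: python3 timeit.py ./bin/clang -Os -S test.c
    real, user, system = time_command(sys.argv[1:])
    print("real %.3fs\nuser %.3fs\nsys %.3fs" % (real, user, system))
```

A single run is noisy, so it is probably worth running each configuration a few times and comparing the minimum or median.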
http://reviews.llvm.org/D19659