[cfe-dev] One little test, comparing clang and gcc
Daniel Dunbar
daniel at zuster.org
Thu Oct 29 12:17:02 PDT 2009
Hi Giangiacomo,
On Thu, Oct 29, 2009 at 10:19 AM, Giangiacomo Mariotti
<gg.mariotti at gmail.com> wrote:
> Hi all,
> I just wanted to try clang, so I compiled the current trunk.
>
> clang version 1.1 (trunk 85507)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
>
> Then I compiled with it and gcc-4.4.2 the appended code with:
> 1) $(CC) -Wall -O0
> 2) $(CC) -Wall -O2
> 3) $(CC) -Wall -O3
>
> No errors, good.
> ls -l:
> 27713 2009-10-29 17:48 time_tests_clang
> 19346 2009-10-29 17:49 time_tests_clang_02
> 19346 2009-10-29 17:49 time_tests_clang_03
> 23329 2009-10-29 17:48 time_tests_gcc
> 19039 2009-10-29 17:48 time_tests_gcc_02
> 19039 2009-10-29 17:49 time_tests_gcc_03
>
> strip --strip-all
> ls -l:
> 23512 2009-10-29 17:53 time_tests_clang
> 15280 2009-10-29 17:53 time_tests_clang_02
> 15280 2009-10-29 17:53 time_tests_clang_03
> 19128 2009-10-29 17:52 time_tests_gcc
> 15008 2009-10-29 17:52 time_tests_gcc_02
> 15008 2009-10-29 17:52 time_tests_gcc_03
>
> A bit bloated the unoptimized version, but the optimized version is good.
Yes, clang's -O0 codegen is known to be more verbose than gcc. We'd
like to fix it, but it isn't a very high priority. Also as you
noticed, there isn't much difference between -O2 and -O3 for clang.
> These are the sha1sums of the stripped files:
>
> c99fed6779e4b6078b8a9eca80c64a31e741ac61 time_tests_clang
> e162a06e483d14d6baa64d2bfd519b8d2e1e0d0c time_tests_clang_02
> e162a06e483d14d6baa64d2bfd519b8d2e1e0d0c time_tests_clang_03
> 62579489108af5fc4bb4d4d57d4d8e5a1537b4ee time_tests_gcc
> ee582023b6acbbbb5235e674534d5949bb18fb0f time_tests_gcc_02
> 5f83c4e4474001632590f5689e9eea84c537145a time_tests_gcc_03
>
> The -O3 switch isn't really implemented or the new optimizations
> activated simply didn't find anything to modify on this code?
It's implemented, it just didn't do anything different.
> This is the code (taken from Programming Pearls 2nd ed., by Jon Bentley):
> (with some irrelevant modifications by me)
>
>
> /* Copyright (C) 1999 Lucent Technologies */
> /* From 'Programming Pearls' by Jon Bentley */
>
> /* timemod.c -- Produce table of C run time costs */
>
> #include <stdio.h>
> #include <time.h>
> #include <stdlib.h>
> #include <math.h>
>
> #define MAXN 100000
> int x[MAXN];
> int startn = 5000;
> int n;
>
> /* FUNCTIONS TO BE TIMED */
>
> int intcmp(int *i, int *j)
> {
> return *i - *j;
> }
>
> #define swapmac(i, j) { t = x[i]; x[i] = x[j]; x[j] = t; }
>
> void swapfunc(int i, int j)
> {
> int t = x[i];
> x[i] = x[j];
> x[j] = t;
> }
>
> static inline void swap_func_inline(int arr[], int i, int j)
> {
> int t = arr[i];
> arr[i] = arr[j];
> arr[j] = t;
> }
>
> #define maxmac(a, b) ((a) > (b) ? (a) : (b))
>
> int maxfunc(int a, int b)
> {
> return a > b ? a : b;
> }
>
>
> /* WORKHORSE */
>
> #define T(s, n)\
> do {\
> typeof(s) s__ = s;\
> typeof(n) n__ = n;\
> printf("%s (n=%d)\n", s__, n__);\
> } while (0)
>
> #define TRIALS 5
>
> #define M(op) do { \
> printf(" %-26s", #op); \
> k = 0; \
> timesum = 0; \
> for (ex = 0; ex < TRIALS; ex++) { \
> start = clock(); \
> for (i = 1; i <= n; i++) { \
> fi = (float) i; \
> for (j = 1; j <= n; j++) { \
> op; \
> } \
> } \
> t = clock()-start; \
> printf("%8d", t); \
> timesum += t; \
> } \
> nans = 1e9 * timesum / ((double) \
> n*n * TRIALS * CLOCKS_PER_SEC); \
> printf("%8.0f\n", nans); } while (0)
>
>
> int main()
> {
> int i, j, k;
> float fi, fj, fk;
> int t, ex, timesum, start, globalstart;
> double nans;
>
> globalstart = clock();
> for (i = 0; i < n; i++)
> x[i] = rand();
> n = startn;
> printf("C Time Cost Model, n=%d\n", n);
> T("\nInteger Arithmetic", n);
> M({});
> M(k++);
> M(k = i);
> M(k = i + j);
> M(k = i - j);
> M(k = i * j);
> M(k = i / j);
> M(k = i % j);
> M(k = i & j);
> M(k = i | j);
> T("\nFloating Point Arithmetic", n);
> M(fj=j;);
> M(fj=j; fk = fi + fj;);
> M(fj=j; fk = fi - fj;);
> M(fj=j; fk = fi * fj;);
> M(fj=j; fk = fi / fj;);
> T("\nArray Operations", n);
> M(k = i + j);
> M(k = x[i] + j);
> M(k = i + x[j]);
> M(k = x[i] + x[j]);
> T("\nComparisons", n);
> M(if (i < j) k++);
> M(if (x[i] < x[j]) k++);
> T("\nArray Comparisons and Swaps", n);
> M(k = (x[i]<x[k]) ? -1 : 1);
> M(k = intcmp(x+i, x+j));
> M(swapmac(i, j));
> M(swapfunc(i, j));
> M(swap_func_inline(x, i, j));
> T("\nMax Function, Macro and Inline", n);
> M(k = (i > j) ? i : j);
> M(k = maxmac(i, j));
> M(k = maxfunc(i, j));
> n = startn / 5;
> T("\nMath Functions", n);
> M(fk = j+fi;);
> M(k = rand(););
> M(fk = sqrt(j+fi));
> M(fk = sin(j+fi));
> M(fk = sinh(j+fi));
> M(fk = asin(j+fi));
> M(fk = cos(j+fi));
> M(fk = tan(j+fi));
> n = startn / 10;
> T("\nMemory Allocation", n);
> M(free(malloc(16)););
> M(free(malloc(100)););
> M(free(malloc(2000)););
>
> /* Possible additions: input, output, malloc */
> printf("\n Secs: %4.2f\n",
> ((double) clock()-globalstart)
> / CLOCKS_PER_SEC);
> return 0;
> }
>
> These are the execution results:
>
> ->time_tests_gcc:
> C Time Cost Model, n=5000
>
> Integer Arithmetic (n=5000)
> {} 60000 70000 60000 70000 60000 3
> k++ 60000 60000 60000 60000 70000 2
> k = i 60000 70000 60000 70000 60000 3
> k = i + j 60000 50000 60000 60000 50000 2
> k = i - j 60000 60000 70000 60000 70000 3
> k = i * j 50000 60000 50000 60000 60000 2
> k = i / j 100000 100000 100000 110000 100000 4
> k = i % j 100000 100000 110000 100000 100000 4
> k = i & j 60000 50000 60000 60000 50000 2
> k = i | j 60000 50000 60000 60000 50000 2
>
> Floating Point Arithmetic (n=5000)
> fj=j; 70000 60000 70000 60000 70000 3
> fj=j; fk = fi + fj; 120000 120000 120000 120000 120000 5
> fj=j; fk = fi - fj; 130000 120000 120000 120000 120000 5
> fj=j; fk = fi * fj; 130000 130000 130000 130000 130000 5
> fj=j; fk = fi / fj; 230000 220000 230000 220000 220000 9
>
> Array Operations (n=5000)
> k = i + j 60000 60000 50000 60000 50000 2
> k = x[i] + j 60000 60000 50000 60000 60000 2
> k = i + x[j] 60000 60000 60000 60000 60000 2
> k = x[i] + x[j] 70000 70000 70000 70000 70000 3
>
> Comparisons (n=5000)
> if (i < j) k++ 60000 70000 60000 60000 60000 2
> if (x[i] < x[j]) k++ 70000 80000 70000 70000 80000 3
>
> Array Comparisons and Swaps (n=5000)
> k = (x[i]<x[k]) ? -1 : 1 70000 70000 70000 80000 70000 3
> k = intcmp(x+i, x+j) 160000 160000 170000 160000 160000 6
> swapmac(i, j) 110000 120000 110000 120000 110000 5
> swapfunc(i, j) 200000 200000 200000 200000 210000 8
> swap_func_inline(x, i, j) 210000 220000 220000 210000 220000 9
>
> Max Function, Macro and Inline (n=5000)
> k = (i > j) ? i : j 70000 70000 60000 70000 70000 3
> k = maxmac(i, j) 60000 70000 70000 70000 60000 3
> k = maxfunc(i, j) 140000 140000 140000 140000 140000 6
>
> Math Functions (n=1000)
> fk = j+fi; 10000 0 0 0 0 2
> k = rand(); 20000 10000 10000 10000 10000 12
> fk = sqrt(j+fi) 20000 10000 20000 20000 20000 18
> fk = sin(j+fi) 50000 60000 60000 60000 60000 58
> fk = sinh(j+fi) 30000 20000 30000 30000 30000 28
> fk = asin(j+fi) 30000 20000 30000 30000 20000 26
> fk = cos(j+fi) 60000 60000 60000 60000 60000 60
> fk = tan(j+fi) 80000 80000 90000 80000 80000 82
>
> Memory Allocation (n=500)
> free(malloc(16)); 10000 10000 10000 0 10000 32
> free(malloc(100)); 10000 10000 0 10000 10000 32
> free(malloc(2000)); 10000 20000 10000 10000 20000 56
>
> Secs: 15.40
>
> ->time_tests_clang:
> C Time Cost Model, n=5000
>
> Integer Arithmetic (n=5000)
> {} 60000 70000 60000 70000 60000 3
> k++ 60000 50000 60000 60000 50000 2
> k = i 70000 60000 70000 60000 70000 3
> k = i + j 50000 60000 60000 50000 60000 2
> k = i - j 50000 60000 60000 50000 60000 2
> k = i * j 60000 60000 60000 60000 60000 2
> k = i / j 100000 110000 100000 100000 100000 4
> k = i % j 110000 100000 100000 110000 100000 4
> k = i & j 60000 50000 60000 60000 50000 2
> k = i | j 60000 60000 50000 60000 60000 2
>
> Floating Point Arithmetic (n=5000)
> fj=j; 70000 70000 80000 70000 80000 3
> fj=j; fk = fi + fj; 120000 120000 120000 120000 130000 5
> fj=j; fk = fi - fj; 120000 120000 120000 120000 120000 5
> fj=j; fk = fi * fj; 140000 130000 130000 130000 130000 5
> fj=j; fk = fi / fj; 240000 250000 240000 240000 240000 10
>
> Array Operations (n=5000)
> k = i + j 60000 60000 50000 60000 60000 2
> k = x[i] + j 60000 70000 60000 70000 70000 3
> k = i + x[j] 60000 70000 60000 70000 70000 3
> k = x[i] + x[j] 70000 80000 70000 80000 70000 3
>
> Comparisons (n=5000)
> if (i < j) k++ 60000 70000 60000 60000 60000 2
> if (x[i] < x[j]) k++ 70000 80000 80000 70000 80000 3
>
> Array Comparisons and Swaps (n=5000)
> k = (x[i]<x[k]) ? -1 : 1 80000 90000 80000 90000 90000 3
> k = intcmp(x+i, x+j) 170000 180000 170000 180000 180000 7
> swapmac(i, j) 120000 110000 120000 110000 120000 5
> swapfunc(i, j) 190000 180000 190000 190000 190000 8
> swap_func_inline(x, i, j) 220000 210000 220000 220000 220000 9
>
> Max Function, Macro and Inline (n=5000)
> k = (i > j) ? i : j 70000 80000 80000 80000 70000 3
> k = maxmac(i, j) 80000 80000 80000 80000 80000 3
> k = maxfunc(i, j) 180000 170000 170000 170000 170000 7
>
> Math Functions (n=1000)
> fk = j+fi; 10000 0 0 0 10000 4
> k = rand(); 10000 10000 10000 10000 10000 10
> fk = sqrt(j+fi) 20000 20000 20000 20000 20000 20
> fk = sin(j+fi) 50000 60000 60000 50000 60000 56
> fk = sinh(j+fi) 30000 30000 30000 30000 30000 30
> fk = asin(j+fi) 30000 20000 30000 30000 30000 28
> fk = cos(j+fi) 60000 50000 60000 60000 60000 58
> fk = tan(j+fi) 80000 80000 80000 90000 80000 82
>
> Memory Allocation (n=500)
> free(malloc(16)); 10000 0 10000 10000 10000 32
> free(malloc(100)); 0 10000 10000 10000 0 24
> free(malloc(2000)); 20000 10000 10000 20000 10000 56
>
> Secs: 15.99
>
> ->time_tests_gcc_02:
> C Time Cost Model, n=5000
>
> Integer Arithmetic (n=5000)
> {} 0 0 0 0 0 0
> k++ 0 0 0 0 0 0
> k = i 0 0 0 0 0 0
> k = i + j 0 0 0 0 0 0
> k = i - j 0 0 0 0 0 0
> k = i * j 0 0 0 0 0 0
> k = i / j 0 0 0 0 0 0
> k = i % j 0 0 0 0 0 0
> k = i & j 0 0 0 0 0 0
> k = i | j 0 0 0 0 0 0
>
> Floating Point Arithmetic (n=5000)
> fj=j; 0 0 0 0 0 0
> fj=j; fk = fi + fj; 0 0 0 0 0 0
> fj=j; fk = fi - fj; 0 0 0 0 0 0
> fj=j; fk = fi * fj; 0 0 0 0 0 0
> fj=j; fk = fi / fj; 0 0 0 0 0 0
>
> Array Operations (n=5000)
> k = i + j 0 0 0 0 0 0
> k = x[i] + j 0 0 0 0 0 0
> k = i + x[j] 0 0 0 0 0 0
> k = x[i] + x[j] 0 0 0 0 0 0
>
> Comparisons (n=5000)
> if (i < j) k++ 0 0 0 0 0 0
> if (x[i] < x[j]) k++ 0 0 0 0 0 0
>
> Array Comparisons and Swaps (n=5000)
> k = (x[i]<x[k]) ? -1 : 1 0 0 0 0 0 0
> k = intcmp(x+i, x+j) 0 0 0 0 0 0
> swapmac(i, j) 30000 30000 30000 20000 30000 1
> swapfunc(i, j) 30000 30000 30000 30000 20000 1
> swap_func_inline(x, i, j) 40000 30000 40000 30000 40000 1
>
> Max Function, Macro and Inline (n=5000)
> k = (i > j) ? i : j 0 0 0 0 0 0
> k = maxmac(i, j) 0 0 0 0 0 0
> k = maxfunc(i, j) 0 0 0 0 0 0
>
> Math Functions (n=1000)
> fk = j+fi; 0 0 0 0 0 0
> k = rand(); 10000 10000 10000 10000 10000 10
> fk = sqrt(j+fi) 0 10000 0 0 0 2
> fk = sin(j+fi) 0 0 0 0 0 0
> fk = sinh(j+fi) 20000 20000 20000 20000 20000 20
> fk = asin(j+fi) 20000 20000 20000 20000 20000 20
> fk = cos(j+fi) 0 0 0 0 0 0
> fk = tan(j+fi) 0 0 0 0 0 0
>
> Memory Allocation (n=500)
> free(malloc(16)); 10000 10000 0 10000 10000 32
> free(malloc(100)); 0 10000 10000 10000 10000 32
> free(malloc(2000)); 10000 10000 20000 10000 10000 48
>
> Secs: 0.86
>
> ->time_tests_clang_02
> C Time Cost Model, n=5000
>
> Integer Arithmetic (n=5000)
> {} 10000 20000 20000 20000 20000 1
> k++ 20000 20000 20000 20000 20000 1
> k = i 10000 20000 20000 20000 20000 1
> k = i + j 20000 20000 20000 20000 10000 1
> k = i - j 20000 20000 20000 20000 20000 1
> k = i * j 20000 20000 20000 10000 20000 1
> k = i / j 20000 20000 20000 20000 20000 1
> k = i % j 20000 20000 20000 10000 20000 1
> k = i & j 20000 20000 20000 20000 20000 1
> k = i | j 20000 20000 10000 20000 20000 1
>
> Floating Point Arithmetic (n=5000)
> fj=j; 20000 20000 20000 20000 20000 1
> fj=j; fk = fi + fj; 20000 20000 20000 20000 20000 1
> fj=j; fk = fi - fj; 10000 20000 20000 20000 20000 1
> fj=j; fk = fi * fj; 20000 20000 20000 20000 20000 1
> fj=j; fk = fi / fj; 20000 20000 20000 20000 20000 1
>
> Array Operations (n=5000)
> k = i + j 20000 20000 10000 20000 20000 1
> k = x[i] + j 20000 20000 20000 20000 20000 1
> k = i + x[j] 20000 20000 20000 20000 20000 1
> k = x[i] + x[j] 20000 10000 20000 20000 20000 1
>
> Comparisons (n=5000)
> if (i < j) k++ 20000 20000 20000 20000 20000 1
> if (x[i] < x[j]) k++ 20000 20000 20000 20000 10000 1
>
> Array Comparisons and Swaps (n=5000)
> k = (x[i]<x[k]) ? -1 : 1 20000 20000 20000 20000 20000 1
> k = intcmp(x+i, x+j) 20000 20000 20000 20000 10000 1
> swapmac(i, j) 30000 30000 30000 30000 30000 1
> swapfunc(i, j) 30000 20000 30000 30000 30000 1
> swap_func_inline(x, i, j) 30000 30000 30000 30000 20000 1
>
> Max Function, Macro and Inline (n=5000)
> k = (i > j) ? i : j 20000 20000 20000 20000 20000 1
> k = maxmac(i, j) 20000 20000 10000 20000 20000 1
> k = maxfunc(i, j) 20000 20000 20000 20000 20000 1
>
> Math Functions (n=1000)
> fk = j+fi; 0 0 0 0 0 0
> k = rand(); 10000 10000 10000 10000 10000 10
> fk = sqrt(j+fi) 20000 20000 20000 10000 20000 18
> fk = sin(j+fi) 60000 50000 60000 60000 50000 56
> fk = sinh(j+fi) 30000 30000 30000 20000 30000 28
> fk = asin(j+fi) 30000 30000 20000 30000 30000 28
> fk = cos(j+fi) 60000 50000 60000 60000 50000 56
> fk = tan(j+fi) 80000 80000 90000 80000 80000 82
>
> Memory Allocation (n=500)
> free(malloc(16)); 0 0 0 0 0 0
> free(malloc(100)); 0 0 0 0 0 0
> free(malloc(2000)); 0 0 0 0 0 0
>
> Secs: 4.30
>
> ->time_tests_gcc_03:
> C Time Cost Model, n=5000
>
> Integer Arithmetic (n=5000)
> {} 0 0 0 0 0 0
> k++ 0 0 0 0 0 0
> k = i 0 0 0 0 0 0
> k = i + j 0 0 0 0 0 0
> k = i - j 0 0 0 0 0 0
> k = i * j 0 0 0 0 0 0
> k = i / j 0 0 0 0 0 0
> k = i % j 0 0 0 0 0 0
> k = i & j 0 0 0 0 0 0
> k = i | j 0 0 0 0 0 0
>
> Floating Point Arithmetic (n=5000)
> fj=j; 0 0 0 0 0 0
> fj=j; fk = fi + fj; 0 0 0 0 0 0
> fj=j; fk = fi - fj; 0 0 0 0 0 0
> fj=j; fk = fi * fj; 0 0 0 0 0 0
> fj=j; fk = fi / fj; 0 0 0 0 0 0
>
> Array Operations (n=5000)
> k = i + j 0 0 0 0 0 0
> k = x[i] + j 0 0 0 0 0 0
> k = i + x[j] 0 0 0 0 0 0
> k = x[i] + x[j] 0 0 0 0 0 0
>
> Comparisons (n=5000)
> if (i < j) k++ 0 0 0 0 0 0
> if (x[i] < x[j]) k++ 0 0 0 0 0 0
>
> Array Comparisons and Swaps (n=5000)
> k = (x[i]<x[k]) ? -1 : 1 0 0 0 0 0 0
> k = intcmp(x+i, x+j) 0 0 0 0 0 0
> swapmac(i, j) 30000 30000 30000 20000 30000 1
> swapfunc(i, j) 30000 30000 30000 30000 30000 1
> swap_func_inline(x, i, j) 30000 40000 30000 40000 30000 1
>
> Max Function, Macro and Inline (n=5000)
> k = (i > j) ? i : j 0 0 0 0 0 0
> k = maxmac(i, j) 0 0 0 0 0 0
> k = maxfunc(i, j) 0 0 0 0 0 0
>
> Math Functions (n=1000)
> fk = j+fi; 0 0 0 0 0 0
> k = rand(); 10000 10000 10000 10000 10000 10
> fk = sqrt(j+fi) 0 10000 0 0 0 2
> fk = sin(j+fi) 0 0 0 0 0 0
> fk = sinh(j+fi) 20000 20000 20000 20000 20000 20
> fk = asin(j+fi) 20000 20000 20000 20000 20000 20
> fk = cos(j+fi) 0 0 0 0 0 0
> fk = tan(j+fi) 0 0 0 0 0 0
>
> Memory Allocation (n=500)
> free(malloc(16)); 10000 0 10000 10000 10000 32
> free(malloc(100)); 0 10000 10000 10000 0 24
> free(malloc(2000)); 20000 10000 20000 10000 10000 56
>
> Secs: 0.86
>
> time_tests_clang_02 and time_tests_clang_03 are the same, so testing
> it again is irrelevant.
>
> Disregarding the difference between time_tests_gcc_02 and
> time_tests_gcc_03, which doesn't seems relevant, is the difference
> between time_tests_gcc_02 and time_tests_clang_02 something expected?
>
> I'm talking about what gcc's optimizations seems to have removed while
> clang seems to have maintained. Is that a design choice or something
> that simply still needs work?
I don't know the answers to these questions without digging in to your
example -- there is a lot of information here which I find hard to
read. I think you are asking about the time difference between
clang_02 and gcc_02, this is probably due to some missed optimization
which of course we always would like to fix. However, you'll have to
actually dig in to the example to find out what is going on. If you
can identify a small test case where clang clearly isn't optimizing
something it should, please file a bugzilla for it
(http://llvm.org/bugs).
- Daniel
> Giangiacomo Mariotti
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
More information about the cfe-dev
mailing list