[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status

Jack Howarth howarth at bromo.med.uc.edu
Thu Jun 9 14:16:52 PDT 2011


On Thu, Jun 09, 2011 at 03:44:40PM +0200, Duncan Sands wrote:
> Hi Jack, thanks for doing this.
>
>>      Below are the tabulated compile times and executable sizes.
>>
>> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
>> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
>> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>
> These numbers really surprised me: the GCC code generators must be really slow
> if the entire set of LLVM IR and codegen optimizations takes less time to run
> than GCC codegen (since with -fplugin-arg-dragonegg-enable-gcc-optzns the only
> part of GCC being disabled is codegen, i.e. RTL).  I was assuming that I would
> need to reduce the LLVM optimization level to get decent speed.  Are you sure
> that you built GCC with checking disabled (or --enable-checking=release)?
> Can you please also redo this (along with execution times), adding the option
> -fplugin-arg-dragonegg-llvm-ir-optimize=2.  I expect that to always result in
> a decent compile time win for dragonegg wrt stock gcc-4.5.  If it doesn't have
> a significant impact on execution speed, then I'd be tempted to use the formula
>   LLVM optimization level = (1 + GCC optimization level) / 2
> as the default, i.e. GCC -O3 -> LLVM -O2, GCC -O2 -> LLVM -O1, GCC -O1 -> LLVM
> -O1, GCC -O0 -> LLVM -O0, GCC -O5 -> LLVM -O3.
>
> Best wishes, Duncan.
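
For reference, the level mapping proposed above only yields the listed
values if the division truncates, as C integer division does. A minimal
Python sketch under that assumption (the name llvm_opt_level is
illustrative and is not taken from dragonegg):

```python
def llvm_opt_level(gcc_level: int) -> int:
    """Map a GCC -O level to an LLVM -O level with the proposed formula.

    Hypothetical helper for illustration only; truncating (floor)
    division is assumed, which matches C for non-negative integers.
    """
    return (1 + gcc_level) // 2


# Print the mapping for the levels mentioned in the proposal.
for gcc_level in (0, 1, 2, 3, 5):
    print(f"GCC -O{gcc_level} -> LLVM -O{llvm_opt_level(gcc_level)}")
```

With truncation, both GCC -O1 and GCC -O2 land on LLVM -O1, which is
the behaviour the proposal describes.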

I get about the same results with gcc 4.5.4 built using --enable-checking=release:

Compile time (seconds)

Benchmark     A) stock      B) gcc 4.5.4/   C) gcc 4.5.4/
              gcc 4.5.4     dragonegg/      dragonegg
                            optzns
ac             0.86          0.44            0.31
aermod        31.13         25.81           20.94
air            1.74          1.48            0.81
capacita       0.86          0.74            0.44
channel        0.35          0.32            0.23
doduc          3.08          2.63            1.63
fatigue        1.04          1.05            0.89
gas_dyn        0.94          0.94            0.75
induct         3.30          2.52            1.84
linpk          0.33          0.28            0.20
mdbx           1.09          1.02            0.60 
nf             0.41          0.40            0.28
protein        1.56          1.28            0.98
rnflow         1.75          1.70            1.24 
test_fpu       1.38          1.41            1.05
tfft           0.31          0.28            0.19

mean           3.13          2.64            2.02
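
The mean row above is a plain arithmetic mean over the sixteen
benchmarks. The Python sketch below recomputes it from the table and
prints each dragonegg mean relative to stock gcc; the variable and
function names are illustrative. Note that aermod alone accounts for
more than half of each column total, so the mean mostly tracks that
one benchmark.

```python
# Compile times in seconds, copied from the table above:
# benchmark -> (A: stock gcc, B: dragonegg with optzns, C: dragonegg)
times = {
    "ac":       (0.86, 0.44, 0.31),
    "aermod":   (31.13, 25.81, 20.94),
    "air":      (1.74, 1.48, 0.81),
    "capacita": (0.86, 0.74, 0.44),
    "channel":  (0.35, 0.32, 0.23),
    "doduc":    (3.08, 2.63, 1.63),
    "fatigue":  (1.04, 1.05, 0.89),
    "gas_dyn":  (0.94, 0.94, 0.75),
    "induct":   (3.30, 2.52, 1.84),
    "linpk":    (0.33, 0.28, 0.20),
    "mdbx":     (1.09, 1.02, 0.60),
    "nf":       (0.41, 0.40, 0.28),
    "protein":  (1.56, 1.28, 0.98),
    "rnflow":   (1.75, 1.70, 1.24),
    "test_fpu": (1.38, 1.41, 1.05),
    "tfft":     (0.31, 0.28, 0.19),
}


def column_means(rows):
    """Return the arithmetic mean of each column across all benchmarks."""
    columns = list(zip(*rows.values()))
    return [sum(col) / len(col) for col in columns]


mean_a, mean_b, mean_c = column_means(times)
print(f"mean  A={mean_a:.2f}  B={mean_b:.2f}  C={mean_c:.2f}")
# Ratios below 1.0 mean the configuration compiles faster than stock gcc.
print(f"B / A = {mean_b / mean_a:.2f}   C / A = {mean_c / mean_a:.2f}")
```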

I wouldn't put a lot of faith in the compile time measurements
because unlike the actual benchmark runs, pb05 doesn't attempt to
repeat the compilations until it has converged on a low error
measurement for the compilation time.
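
One way to firm up the compile times would be to repeat each
compilation and report the minimum and the spread. A rough Python
sketch follows, assuming the compile command is passed as an argument
list. The helper name and the gcc command line in the comment are
placeholders, not what pb05 actually runs.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times; return each run's wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


# Placeholder command so the sketch runs anywhere. Substitute the real
# compile line, e.g. ["gcc", "-O3", "-c", "ac.f90"] (hypothetical).
samples = time_command([sys.executable, "-c", "pass"], runs=5)
print(f"min {min(samples):.3f}s  "
      f"median {statistics.median(samples):.3f}s  "
      f"stdev {statistics.stdev(samples):.3f}s")
```

The minimum is usually the least noisy figure for a deterministic job
like a compile, while the standard deviation shows how far a single
run can be trusted.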
            Jack

>
>>
>> Compile time (seconds)
>>
>> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/
>>                 gcc 4.5.4   dragonegg/optzns    dragonegg
>>
>> ac                0.61        1.65           0.32
>> aermod           31.24       25.83          21.02
>> air               1.74        1.49           0.81
>> capacita          0.83        0.80           0.44
>> channel           0.34        0.33           0.25
>> doduc             3.09        2.63           1.63
>> fatigue           1.04        1.08           0.84
>> gas_dyn           0.91        0.95           0.75
>> induct            3.18        2.57           1.73
>> linpk             0.34        0.30           0.21
>> mdbx              1.08        1.01           0.59
>> nf                0.39        0.41           0.28
>> protein           1.55        1.29           0.97
>> rnflow            1.76        1.73           1.26
>> test_fpu          1.38        1.40           1.05
>> tfft              0.31        0.28           0.19
>>
>> mean              3.11        2.73           2.02
>>
>> Executable size (bytes)
>>
>> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/
>>                 gcc 4.5.4   dragonegg/optzns    dragonegg
>>
>> ac              26344        30896           26704
>> aermod        1145924      1043816         1052056
>> air             57404        57700           53532
>> capacita        40864        41008           37064
>> channel         22448        22664           22664
>> doduc          127340       124108          120124
>> fatigue         61152        65352           65664
>> gas_dyn        647864        58768 !!!       59024
>> induct         162360       180440          175312
>> linpk           18112        18848           18864
>> mdbx            53464        57652           49516
>> nf              22560        23784           24080
>> protein         74320        74440           74816
>> rnflow          66040        71488           71648
>> test_fpu        52624        58224           58320
>> tfft            18416        18456           18600
>>
>> The compile times with optzns are 26% slower than stock dragonegg
>> but 12% faster than stock gcc 4.5.4. The most interesting executable
>> size difference is gas_dyn, which is fastest with optzns but 11x larger
>> in size with stock gcc 4.5.4 compared to either stock dragonegg or
>> dragonegg with optzns. This is likely much improved in gcc 4.6 with
>> the new -fwhole-file default.
>>
>> On Thu, Jun 09, 2011 at 09:51:51AM +0200, Duncan Sands wrote:
>>> Hi Jack, thanks for these numbers.  Can you also please measure compile times?
>>> I'm thinking of enabling gcc optimizations by default, but I don't want to
>>> increase compile times, which means choosing a value for the
>>> -fplugin-arg-dragonegg-llvm-ir-optimize option that is low enough to get good
>>> compile times, yet high enough to get fast code.  It would be great if you could
>>> play around with this to find a good choice.
>>>
>>> Best wishes, Duncan.
>>>
>>>>     Current dragonegg svn has all of the -fplugin-arg-dragonegg-enable-gcc-optzns bugs for
>>>> usage with -ffast-math -O3 addressed except for those related to PR2314. Using the -fno-tree-vectorize
>>>> option, we can evaluate the current state of -fplugin-arg-dragonegg-enable-gcc-optzns with
>>>> the Polyhedron 2005 benchmarks compared to stock dragonegg and stock gcc 4.5.4. The runtime
>>>> benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly
>>>> faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns.
>>>>
>>>> x86_64 darwin
>>>>
>>>> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
>>>> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
>>>> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>>>>
>>>>
>>>> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/
>>>>                 gcc 4.5.4   dragonegg/optzns    dragonegg
>>>>
>>>> ac               9.58          9.13            12.30
>>>> aermod          20.88         16.10            17.62
>>>> air              6.16          6.59             7.70
>>>> capacita        35.68         39.94            46.22
>>>> channel          2.03          2.04             1.96
>>>> doduc           28.28         28.43            30.41
>>>> fatigue          8.13          7.19            10.40
>>>> gas_dyn         10.10          9.83            11.73
>>>> induct          20.17         20.76            48.76
>>>> linpk           15.42         15.65            15.69
>>>> mdbx            11.42         11.73            12.07
>>>> nf              27.99         28.60            29.39
>>>> protein         38.36         39.08            39.98
>>>> rnflow          27.28         28.19            31.90
>>>> test_fpu        11.43         11.17            11.50
>>>> tfft             1.91          1.95             2.16
>>>>
>>>> Mean            12.72         12.62            14.71
>>>>
>>>> Once vector_select() is implemented we can retest without -fno-tree-vectorize.
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev