[cfe-dev] Fwd: clang++ vs g++ compilation speed for ace-tao servants

Igor Vagulin igor.vagulin at gmail.com
Thu Nov 7 14:53:11 PST 2013


Hi Richard,

I've spent a few days tweaking every flag I could find to bring clang's
compile time close to gcc's. With the best combination I found, clang
was still roughly 35% slower than gcc (4.4 from RHEL 6). Maybe you can
give me another hint? I really want to switch :).

I created 100 copies of RemoveClusterObserver.cpp; below are the test
results of "time g++|clang++ -c RemoveClusterObserver-*cpp":
g++: 0m58.510s
clang++ default opts: 2m4.065s
+ march=core2: 1m43.729s
+ BUILD_SHARED_LIBS=NO: 1m27.216s
+ clang built by clang++: 1m18.816s

I also tried measuring the time to compile the clang sources themselves,
thinking this case should favor clang. No luck: clang is still slower in
every scenario. BTW, here I also measured the Intel compiler (icc).
- gcc-4.4:
real    7m7.759s
user    45m53.622s
sys    2m3.657s
- icc 2013-sp1:
real    8m4.175s
user    54m39.116s
sys    2m35.159s
- clang compiled by gcc:
real    8m42.278s
user    60m19.175s
sys    0m51.341s
- clang compiled by icc (who is paying for this compiler? :-/):
real    8m2.399s
user    57m31.185s
sys    0m57.272s

Then I thought the problem might be the 32-bit x86 architecture and its
shortage of registers, so I switched to x86_64. That looks like the case:
people claiming "clang compiles faster than gcc" apparently mean "on x86_64".
- gcc:
real    8m8.230s
user    50m1.458s
sys    3m44.313s
- clang compiled by gcc:
real    7m57.747s
user    55m16.080s
sys    1m27.786s
- clang compiled by clang:
real    6m41.412s
user    44m53.298s
sys    1m27.715s

Igor Vagulin


On Fri, Nov 1, 2013 at 3:49 AM, Richard Smith <richard at metafoo.co.uk> wrote:
> Looks like your clang is built with shared libraries enabled. That will be
> hurting your performance somewhat, but I don't know how much. Try without
> -DBUILD_SHARED_LIBS=YES.
>
> Other than that, the only abnormally high cpu usage is within
> Sema::BuildBinOp, which will probably be due to the large number of
> overloads of operator<< etc that I observed earlier.
>
>
> On Thu, Oct 31, 2013 at 4:24 PM, Igor Vagulin <igor.vagulin at gmail.com>
> wrote:
>>
>> I've created a sysprof profile of the compilation (attached; it is best
>> opened with sysprof, which shows a nice tree). I don't see any
>> obvious bottleneck. Time in
>> clang::Parser::ParseExternalDeclaration is distributed pretty equally
>> among its parts. Maybe I'm missing something?
>>
>> Igor Vagulin
>>
>>
>> On Thu, Oct 31, 2013 at 11:20 PM, Richard Smith <richard at metafoo.co.uk>
>> wrote:
>> > We may also be able to pick out an 'obvious winner' (perhaps looking
>> > for one that only requires standard conversions) before trying to
>> > build conversion sequences for all candidates.
>> >
>> >
>> > On Thu, Oct 31, 2013 at 12:19 PM, Richard Smith <richard at metafoo.co.uk>
>> > wrote:
>> >>
>> >> I suspect the problem is overload resolution for the several hundred
>> >> overloads of each of 'operator<<', 'operator<<=', 'operator>>', and
>> >> 'operator>>=' that are present here. Many of these have the same LHS
>> >> parameter type; we could probably improve performance here by caching
>> >> the computation of an implicit conversion sequence for a given
>> >> (argument, parameter type) pair.
>> >>
>> >>
>> >> On Thu, Oct 31, 2013 at 10:05 AM, David Blaikie <dblaikie at gmail.com>
>> >> wrote:
>> >>>
>> >>> On Wed, Oct 30, 2013 at 11:59 PM, Igor Vagulin
>> >>> <igor.vagulin at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> We are evaluating a switch from gcc to clang for our C++ application.
>> >>>> The main focus is compilation speed, but we are also looking at the
>> >>>> address/memory sanitizers, C++11 support and C++ modules. I've tried
>> >>>> compiling our project with clang++, but the resulting compilation
>> >>>> time is longer than with gcc. Can someone give me a hint as to where
>> >>>> the problem might be?
>> >>>>
>> >>>> Our project is a bunch of ACE/TAO CORBA servants. Overall compile
>> >>>> time with clang++ is about twice that of gcc. To reproduce the
>> >>>> problem I preprocessed one file and then compiled it; the result is
>> >>>> the same: compilation takes about twice as long. I don't know where
>> >>>> to look next.
>> >>>
>> >>>
>> >>> If you're lucky, someone might look for you if you attach (or link
>> >>> to, if it's too big to attach) an example of the problem.
>> >>>
>> >>> Otherwise, you might want to get started with a profiler and see where
>> >>> the hot parts of Clang/LLVM are in your example.
>> >>>
>> >>> (also, consider trying with Clang top of tree (straight from
>> >>> svn/git) - the project moves fairly quickly)
>> >>>
>> >>>>
>> >>>> [root at ivagulin-pc ~]# time clang++ -c RemoveClusterObserverClang.cpp
>> >>>> real    0m1.283s
>> >>>> user    0m1.254s
>> >>>> sys    0m0.024s
>> >>>> [root at ivagulin-pc ~]# time g++ -c RemoveClusterObserverGcc.cpp
>> >>>> real    0m0.576s
>> >>>> user    0m0.524s
>> >>>> sys    0m0.048s
>> >>>>
>> >>>> I use llvm-3.3 and cfe-3.3 compiled with the following options.
>> >>>> Sources of RemoveClusterObserver*cpp are attached.
>> >>>> + cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release
>> >>>> '-DCMAKE_CXX_FLAGS_RELEASE=-O3 -g -mtune=amdfam10 -march=i686'
>> >>>> '-DCMAKE_C_FLAGS_RELEASE=-O3 -g -mtune=amdfam10 -march=i686'
>> >>>> '-DCMAKE_EXE_LINKER_FLAGS_RELEASE=-Wl,--as-needed -Wl,--strip-all'
>> >>>> '-DCMAKE_MODULE_LINKER_FLAGS_RELEASE=-Wl,--as-needed -Wl,--strip-all'
>> >>>> '-DCMAKE_SHARED_LINKER_FLAGS_RELEASE=-Wl,--as-needed -Wl,--strip-all
>> >>>> -shared' -DCMAKE_SKIP_RPATH=YES -DBUILD_SHARED_LIBS=YES
>> >>>> -DLLVM_ENABLE_TIMESTAMPS=NO ..
>> >>>>
>> >>>> Igor Vagulin
>> >>>>
>> >>>> _______________________________________________
>> >>>> cfe-dev mailing list
>> >>>> cfe-dev at cs.uiuc.edu
>> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>
>> >
>
>


