[Openmp-dev] Phoronix numbers for clang-omp compiler

Alexey Bataev a.bataev at hotmail.com
Thu Jun 5 05:43:40 PDT 2014


Jack,
Yes, this is true, but I don't think that it is a good idea to compare 
performance between code compiled with -O2 and compiled with -O0.

Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team

On 5 June 2014, 16:06:32, Jack Howarth wrote:
> Alexey,
>       This wouldn't be the first time that Phoronix has gotten sloppy
> with using a common set of optimization flags in their gcc vs clang
> benchmarks. Although, it is getting harder to make direct comparisons.
> Clang now autovectorizes at -O2 while, as you can see from
> my prior posting of the assembly from -fverbose-asm -O2, FSF gcc 4.9.0
> doesn't.
>            Jack
>
>
>
> On Wed, Jun 4, 2014 at 10:32 PM, Alexey Bataev <a.bataev at hotmail.com> wrote:
>
>     Jack,
>     Actually everything is quite simple. The config files for the
>     Phoronix test suites use gcc by default, and -O2 is provided by
>     default in these config files. I think they just compiled these
>     tests with gcc using the default config (which already includes -O2),
>     while for clang they had to specify some custom options (like
>     "-fopenmp -lgomp", though the second one is not needed) but
>     forgot to add any optimization options.
>
>     Best regards,
>     Alexey Bataev
>     =============
>
>     Software Engineer
>     Intel Compiler Team
>
>     04.06.2014 22:19, openmp-dev-request at cs.uiuc.edu wrote:
>
>         Message: 4
>         Date: Wed, 4 Jun 2014 14:19:49 -0400
>         From: Jack Howarth <howarth.mailing.lists at gmail.com>
>         To: Andrey Bokhanko <andreybokhanko at gmail.com>
>         Cc: "openmp-dev at dcs-maillist2.engr.illinois.edu"
>                 <openmp-dev at dcs-maillist2.engr.illinois.edu>
>         Subject: Re: [Openmp-dev] Phoronix numbers for clang-omp compiler
>         Message-ID:
>         <CADtEn-2YdnYNjg3bOcbGxJCJB87fdPyq1oAok4ZFoBTD33Y8Ag at mail.gmail.com>
>         Content-Type: text/plain; charset="utf-8"
>
>
>         Andrey,
>               FSF gcc is not exactly defaulting to -O2, but its default
>         is certainly higher than the default optimizations on clang.
>
>
>         % touch t.cc
>         % g++-fsf-4.9 -fverbose-asm t.cc -S
>         % more t.s
>         # GNU C++ (GCC) version 4.9.0 (x86_64-apple-darwin11.4.2)
>         #       compiled by GNU C version 4.9.0, GMP version 6.0.0,
>         MPFR version
>         3.1.2, MPC version 1.0.2
>         # GGC heuristics: --param ggc-min-expand=100 --param
>         ggc-min-heapsize=131072
>         # options passed:  -D__DYNAMIC__ t.cc -fPIC
>         -mmacosx-version-min=10.7.4
>         # -mtune=core2 -fverbose-asm
>         # options enabled:  -Wnonportable-cfstrings -fPIC
>         # -faggressive-loop-optimizations -fasynchronous-unwind-tables
>         # -fauto-inc-dec -fcommon -fdelete-null-pointer-checks
>         -fearly-inlining
>         # -feliminate-unused-debug-types -fexceptions -ffunction-cse
>         -fgcse-lm
>         # -fgnu-unique -fident -finline-atomics -fira-hoist-pressure
>         # -fira-share-save-slots -fira-share-spill-slots -fivopts
>         # -fkeep-static-consts -fleading-underscore -fmath-errno
>         # -fmerge-debug-strings -fnext-runtime -fobjc-abi-version=
>         -fpeephole
>         # -fprefetch-loop-arrays -freg-struct-return
>         # -fsched-critical-path-heuristic -fsched-dep-count-heuristic
>         # -fsched-group-heuristic -fsched-interblock
>         -fsched-last-insn-heuristic
>         # -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
>         # -fsched-stalled-insns-dep -fshow-column -fsigned-zeros
>         # -fsplit-ivs-in-unroller -fstrict-volatile-bitfields
>         -fsync-libcalls
>         # -ftrapping-math -ftree-coalesce-vars -ftree-cselim
>         -ftree-forwprop
>         # -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
>         # -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop
>         # -ftree-reassoc -ftree-scev-cprop -funit-at-a-time
>         -funwind-tables
>         # -fverbose-asm -fzero-initialized-in-bss -gstrict-dwarf
>         # -m128bit-long-double -m64 -m80387 -malign-stringops -matt-stubs
>         # -mconstant-cfstrings -mfancy-math-387 -mfp-ret-in-387 -mfxsr
>         -mieee-fp
>         # -mlong-double-80 -mmmx -mno-sse4 -mpush-args -mred-zone
>         -msse -msse2
>         # -msse3
>
>                  .constructor
>                  .destructor
>                  .align 1
>                  .subsections_via_symbols
>
>         % g++-fsf-4.9 -fverbose-asm -O2 t.cc -S
>         % more t.s
>         # GNU C++ (GCC) version 4.9.0 (x86_64-apple-darwin11.4.2)
>         #       compiled by GNU C version 4.9.0, GMP version 6.0.0,
>         MPFR version
>         3.1.2, MPC version 1.0.2
>         # GGC heuristics: --param ggc-min-expand=100 --param
>         ggc-min-heapsize=131072
>         # options passed:  -D__DYNAMIC__ t.cc -fPIC
>         -mmacosx-version-min=10.7.4
>         # -mtune=core2 -O2 -fverbose-asm
>         # options enabled:  -Wnonportable-cfstrings -fPIC
>         # -faggressive-loop-optimizations -fasynchronous-unwind-tables
>         # -fauto-inc-dec -fbranch-count-reg -fcaller-saves
>         # -fcombine-stack-adjustments -fcommon -fcompare-elim
>         -fcprop-registers
>         # -fcrossjumping -fcse-follow-jumps -fdefer-pop
>         # -fdelete-null-pointer-checks -fdevirtualize
>         -fdevirtualize-speculatively
>         # -fearly-inlining -feliminate-unused-debug-types -fexceptions
>         # -fexpensive-optimizations -fforward-propagate -ffunction-cse
>         -fgcse
>         # -fgcse-lm -fgnu-unique -fguess-branch-probability
>         -fhoist-adjacent-loads
>         # -fident -fif-conversion -fif-conversion2 -findirect-inlining
>         -finline
>         # -finline-atomics -finline-functions-called-once
>         -finline-small-functions
>         # -fipa-cp -fipa-profile -fipa-pure-const -fipa-reference
>         -fipa-sra
>         # -fira-hoist-pressure -fira-share-save-slots
>         -fira-share-spill-slots
>         # -fisolate-erroneous-paths-dereference -fivopts
>         -fkeep-static-consts
>         # -fleading-underscore -fmath-errno -fmerge-constants
>         -fmerge-debug-strings
>         # -fmove-loop-invariants -fnext-runtime -fobjc-abi-version=
>         # -fomit-frame-pointer -foptimize-sibling-calls -foptimize-strlen
>         # -fpartial-inlining -fpeephole -fpeephole2
>         -fprefetch-loop-arrays -free
>         # -freg-struct-return -freorder-blocks
>         -freorder-blocks-and-partition
>         # -freorder-functions -frerun-cse-after-loop
>         # -fsched-critical-path-heuristic -fsched-dep-count-heuristic
>         # -fsched-group-heuristic -fsched-interblock
>         -fsched-last-insn-heuristic
>         # -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
>         # -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column
>         -fshrink-wrap
>         # -fsigned-zeros -fsplit-ivs-in-unroller -fsplit-wide-types
>         # -fstrict-aliasing -fstrict-overflow -fstrict-volatile-bitfields
>         # -fsync-libcalls -fthread-jumps -ftoplevel-reorder
>         -ftrapping-math
>         # -ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch
>         # -ftree-coalesce-vars -ftree-copy-prop -ftree-copyrename
>         -ftree-cselim
>         # -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop
>         -ftree-fre
>         # -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
>         # -ftree-loop-optimize -ftree-parallelize-loops=
>         -ftree-phiprop -ftree-pre
>         # -ftree-pta -ftree-reassoc -ftree-scev-cprop -ftree-sink
>         -ftree-slsr
>         # -ftree-sra -ftree-switch-conversion -ftree-tail-merge -ftree-ter
>         # -ftree-vrp -funit-at-a-time -funwind-tables -fverbose-asm
>         # -fzero-initialized-in-bss -gstrict-dwarf
>         -m128bit-long-double -m64
>         # -m80387 -malign-stringops -matt-stubs -mconstant-cfstrings
>         # -mfancy-math-387 -mfp-ret-in-387 -mfxsr -mieee-fp
>         -mlong-double-80 -mmmx
>         # -mno-sse4 -mpush-args -mred-zone -msse -msse2 -msse3
>         -mvzeroupper
>
>                  .constructor
>                  .destructor
>                  .align 1
>                  .subsections_via_symbols
>
>         You might consider filing an enhancement request for clang 3.5
>         to have the
>         default behavior without optimization flags bumped up to -O1.
>                   Jack
>
>
>
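
For readers who want to compare the two -fverbose-asm dumps quoted above
without reading them by eye, here is a minimal Python sketch (not part of the
original thread). It assumes the header layout visible in the dumps: a
"# options enabled:" line followed by "#"-prefixed continuation lines, ending
at the first line that does not start with "#". The function name
enabled_options and the two abbreviated sample headers are illustrative; the
flags in them are a small subset copied from the dumps.

```python
def enabled_options(asm_text):
    """Collect the flags listed under '# options enabled:' in a
    GCC -fverbose-asm header (layout assumed from the quoted dumps)."""
    flags = set()
    in_section = False
    for line in asm_text.splitlines():
        body = line.lstrip("#").strip()
        if body.startswith("options enabled:"):
            in_section = True
            body = body[len("options enabled:"):]
        elif in_section and not line.startswith("#"):
            break  # first non-comment line ends the header
        if in_section:
            flags.update(tok for tok in body.split() if tok.startswith("-"))
    return flags


# Abbreviated sample headers (a subset of the flags quoted above).
HEADER_NO_OPT = """\
# options passed:  -D__DYNAMIC__ t.cc -fPIC -mmacosx-version-min=10.7.4
# -mtune=core2 -fverbose-asm
# options enabled:  -fPIC -fauto-inc-dec -fcommon -fearly-inlining
# -fivopts -fpeephole

        .constructor
"""

HEADER_O2 = """\
# options passed:  -D__DYNAMIC__ t.cc -fPIC -mmacosx-version-min=10.7.4
# -mtune=core2 -O2 -fverbose-asm
# options enabled:  -fPIC -fauto-inc-dec -fcaller-saves -fcommon
# -fearly-inlining -fgcse -fivopts -fpeephole -fpeephole2 -ftree-vrp

        .constructor
"""

# Flags that appear only in the -O2 header.
added_at_o2 = sorted(enabled_options(HEADER_O2) - enabled_options(HEADER_NO_OPT))
for flag in added_at_o2:
    print(flag)
```

To run it on real output, read the two t.s files and pass their contents to
enabled_options in place of the sample strings. Mail-wrapped copies such as
the ones quoted above lose the "#" on some continuation lines, so the parser
should be given the original files rather than text pasted from the archive.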



