[PATCH] Add benchmarking-only mode to the test suite

Yi Kong kongy.dev at gmail.com
Sat May 17 05:08:24 PDT 2014


On 16 May 2014 15:25, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "Yi Kong" <kongy.dev at gmail.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "Eric Christopher" <echristo at gmail.com>, "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
>> <tobias at grosser.es>
>> Sent: Thursday, May 15, 2014 5:41:04 PM
>> Subject: Re: [PATCH] Add benchmarking-only mode to the test suite
>>
>> On 15 May 2014 13:59, Hal Finkel <hfinkel at anl.gov> wrote:
>> > ----- Original Message -----
>> >> From: "Yi Kong" <kongy.dev at gmail.com>
>> >> To: "Hal Finkel" <hfinkel at anl.gov>
>> >> Cc: "Eric Christopher" <echristo at gmail.com>, "llvm-commits"
>> >> <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
>> >> <tobias at grosser.es>
>> >> Sent: Thursday, May 15, 2014 5:26:54 AM
>> >> Subject: Re: [PATCH] Add benchmarking-only mode to the test suite
>> >>
>> >> Hi Hal Finkel,
>> >>
>> >> What criteria do you use to decide which benchmarks are useful?
>> >
>> > Please refer to the LLVMDev thread "[RFC] Benchmarking subset of
>> > the test suite" in which I explain my methodology in detail.
>>
>> I think the approach you've taken is indeed sensible. However, I don't
>> really agree with your make -j6 option. The Xeon chip you are testing
>> on only has 4 cores, which means a lot of context switching happens. The
>
> It is a dual-socket machine.
>
>> noise produced by that would be far too great for a "normal"
>> environment. Also, I believe that the testing machine should be as
>> quiet as possible, otherwise we are actually measuring the noise!
>
> This is obviously ideal, but rarely possible in practice. More to the point, the buildbots are not quiet, but we still want to be able to extract execution-time changes from them without a large number of false positives. Some tests are just too sensitive to I/O time, or are too short, for this to be possible (because you really are just seeing the noise), and this exclusion list is meant to exclude such tests. Given a sufficient number of samples (10, for example), I've confirmed that it is possible to extract meaningful timing differences from the others at high confidence.
>
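
One way to check a claim like this is a permutation test on two sets of
timing samples. The following is only a minimal sketch, not necessarily the
procedure used in this thread; the function name, the trial count, and the
fixed seed are all illustrative assumptions.

```python
import random


def permutation_p_value(before, after, trials=10000, seed=0):
    """Estimate how likely the observed difference in mean run time
    would be if both sample sets came from the same distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(before)
    pooled = list(before) + list(after)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = abs(mean(after) - mean(before))
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        # Split the shuffled pool into two pseudo-groups of the original sizes.
        diff = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)
```

A small p-value from 10 samples per side suggests the timing change is real
rather than noise; a large one means the run cannot distinguish the two.
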
>>
>> I've been investigating the timeit tool in the test suite. It turns out to
>> be really inaccurate, and sometimes it's the main source of the noise we
>> are seeing. I've implemented time measurement using the Linux perf tool. So
>> far it seems to produce much better results. I will publish the
>> findings with the patch in a separate thread once I've gathered enough
>> data points. Maybe with the more accurate timing tool, we might get a
>> different picture.
>
> That's great!
>
>>
>> >
>> >>
>> >> I suggest you also have a look at the standard deviation or
>> >> MAD.
>> >
>> > Of course this has already been considered and taken into account
>> > ;)
>> >
>> >> Some of the tests have really large variance that we may not want to
>> >> include when benchmarking, e.g.
>> >> Polybench/linear-algebra/kernels/3mm/3mm. I've attached a patch which
>> >> makes tables sortable so that it is easier to investigate.
>> >
>> > If you feel that there is a test or tests that have too large a
>> > variance for useful benchmarking, please compose a list, explain
>> > your criteria, and we'll merge in some useful way.
>>
>> Mainly Polybench/linear-algebra, but I can't give you the list right
>> now as the LLVM LNT site is down again.

These 5 tests have really large MAD on various testing machines, even
with perf tools. Please add them to the exclusion list.
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm
MultiSource/Benchmarks/nbench/nbench
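
For reference, MAD here means the median absolute deviation of the run-time
samples. A minimal sketch of how tests could be flagged this way is below.
The 5% relative threshold and the function names are assumptions made for
illustration, not the exact criterion used to pick the tests above.

```python
from statistics import median


def mad(samples):
    """Median absolute deviation: median of |x - median(samples)|."""
    m = median(samples)
    return median(abs(x - m) for x in samples)


def noisy_tests(timings, threshold=0.05):
    """Return the names of tests whose MAD, relative to their median
    run time, exceeds the (assumed) threshold."""
    flagged = []
    for name, samples in timings.items():
        m = median(samples)
        if m > 0 and mad(samples) / m > threshold:
            flagged.append(name)
    return sorted(flagged)
```

Called with a mapping from test name to a list of run times in seconds, this
returns the names that would be candidates for the exclusion list.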

Otherwise, it looks alright to me as well.

>
> That seems odd to me. Do you know why?

No. I haven't been able to reproduce the problem on my local machine.
Maybe Chris knows why.

>
> Thanks again,
> Hal
>
>>
>> >
>> >>
>> >> I see you categorized some tests as not useful for benchmarking. Maybe
>> >> a better way is to extend the running time by increasing the problem
>> >> size? For example, blowfish can be modified to have more Feistel
>> >> rounds.
>> >
>> > Yes, this is certainly true. My goal here is to represent things as
>> > they are now; as tests are updated to be more useful for
>> > benchmarking, they can be removed from the exclusion list.
>>
>> Sure.
>>
>> >
>> > Thanks for the feedback!
>> >
>> >  -Hal
>> >
>> >>
>> >> Cheers,
>> >> Yi Kong
>> >>
>> >> On 14 May 2014 17:34, Hal Finkel <hfinkel at anl.gov> wrote:
>> >> > ----- Original Message -----
>> >> >> From: "Eric Christopher" <echristo at gmail.com>
>> >> >> To: "Hal Finkel" <hfinkel at anl.gov>
>> >> >> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
>> >> >> Sent: Wednesday, May 14, 2014 11:33:15 AM
>> >> >> Subject: Re: [PATCH] Add benchmarking-only mode to the test
>> >> >> suite
>> >> >>
>> >> >> This looks good to me. One request would be for a README.BENCHMARKING
>> >> >> file at the top that describes what's going on and what someone should
>> >> >> do when adding a new test to the testsuite.
>> >> >
>> >> > That's a good idea; will do. Thanks!
>> >> >
>> >> >  -Hal
>> >> >
>> >> >>
>> >> >> -eric
>> >> >>
>> >> >> On Sun, May 11, 2014 at 4:39 PM, Hal Finkel <hfinkel at anl.gov>
>> >> >> wrote:
>> >> >> > Hello everyone,
>> >> >> >
>> >> >> > To follow up on last week's thread on adding a benchmarking-only
>> >> >> > mode to the test suite, which excludes those programs unlikely to
>> >> >> > be useful as benchmarks, here's a patch which provides an
>> >> >> > implementation. When BENCHMARKING_ONLY is defined, the following
>> >> >> > programs should be excluded:
>> >> >> >
>> >> >> > MultiSource/Applications/Burg/burg
>> >> >> > MultiSource/Applications/treecc/treecc
>> >> >> > MultiSource/Benchmarks/MiBench/office-ispell/office-ispell
>> >> >> > MultiSource/Benchmarks/MiBench/security-blowfish/security-blowfish
>> >> >> > MultiSource/Benchmarks/Prolangs-C/allroots/allroots
>> >> >> > MultiSource/Benchmarks/Prolangs-C/archie-client/archie
>> >> >> > MultiSource/Benchmarks/Prolangs-C/assembler/assembler
>> >> >> > MultiSource/Benchmarks/Prolangs-C/compiler/compiler
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/deriv1/deriv1
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/deriv2/deriv2
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/family/family
>> >> >> > MultiSource/Benchmarks/Prolangs-C/fixoutput/fixoutput
>> >> >> > MultiSource/Benchmarks/Prolangs-C/football/football
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/fsm/fsm
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/garage/garage
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/NP/np
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/objects/objects
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/office/office
>> >> >> > MultiSource/Benchmarks/Prolangs-C/plot2fig/plot2fig
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/shapes/shapes
>> >> >> > MultiSource/Benchmarks/Prolangs-C/simulator/simulator
>> >> >> > MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/trees/trees
>> >> >> > MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail
>> >> >> > MultiSource/Benchmarks/Prolangs-C/unix-tbl/unix-tbl
>> >> >> > MultiSource/Benchmarks/Prolangs-C++/vcirc/vcirc
>> >> >> > SingleSource/Benchmarks/McGill/exptree
>> >> >> > SingleSource/Benchmarks/Shootout-C++/hello
>> >> >> > SingleSource/Benchmarks/Shootout-C++/reversefile
>> >> >> > SingleSource/Benchmarks/Shootout-C++/spellcheck
>> >> >> > SingleSource/Benchmarks/Shootout-C++/sumcol
>> >> >> > SingleSource/Benchmarks/Shootout-C++/wc
>> >> >> > SingleSource/Benchmarks/Shootout-C++/wordfreq
>> >> >> > SingleSource/Benchmarks/Shootout/hello
>> >> >> > SingleSource/Regression/C++/2003-05-14-array-init
>> >> >> > SingleSource/Regression/C++/2003-05-14-expr_stmt
>> >> >> > SingleSource/Regression/C/2003-05-14-initialize-string
>> >> >> > SingleSource/Regression/C/2003-05-21-BitfieldHandling
>> >> >> > SingleSource/Regression/C/2003-05-21-UnionBitfields
>> >> >> > SingleSource/Regression/C/2003-05-21-UnionTest
>> >> >> > SingleSource/Regression/C/2003-05-22-LocalTypeTest
>> >> >> > SingleSource/Regression/C/2003-05-22-VarSizeArray
>> >> >> > SingleSource/Regression/C/2003-05-23-TransparentUnion
>> >> >> > SingleSource/Regression/C++/2003-06-08-BaseType
>> >> >> > SingleSource/Regression/C++/2003-06-08-VirtualFunctions
>> >> >> > SingleSource/Regression/C++/2003-06-13-Crasher
>> >> >> > SingleSource/Regression/C/2003-06-16-InvalidInitializer
>> >> >> > SingleSource/Regression/C/2003-06-16-VolatileError
>> >> >> > SingleSource/Regression/C++/2003-08-20-EnumSizeProblem
>> >> >> > SingleSource/Regression/C++/2003-09-29-NonPODsByValue
>> >> >> > SingleSource/Regression/C/2003-10-12-GlobalVarInitializers
>> >> >> > SingleSource/Regression/C/2004-02-03-AggregateCopy
>> >> >> > SingleSource/Regression/C/2004-03-15-IndirectGoto
>> >> >> > SingleSource/Regression/C/2005-05-06-LongLongSignedShift
>> >> >> > SingleSource/Regression/C/2008-01-07-LongDouble
>> >> >> > SingleSource/Regression/C++/2008-01-29-ParamAliasesReturn
>> >> >> > SingleSource/Regression/C++/2011-03-28-Bitfield
>> >> >> > SingleSource/Regression/C/badidx
>> >> >> > SingleSource/Regression/C/bigstack
>> >> >> > SingleSource/Regression/C++/BuiltinTypeInfo
>> >> >> > SingleSource/Regression/C/callargs
>> >> >> > SingleSource/Regression/C/casts
>> >> >> > SingleSource/Regression/C/compare
>> >> >> > SingleSource/Regression/C/ConstructorDestructorAttributes
>> >> >> > SingleSource/Regression/C/DuffsDevice
>> >> >> > SingleSource/Regression/C++/EH/class_hierarchy
>> >> >> > SingleSource/Regression/C++/EH/ConditionalExpr
>> >> >> > SingleSource/Regression/C++/EH/ctor_dtor_count
>> >> >> > SingleSource/Regression/C++/EH/ctor_dtor_count-2
>> >> >> > SingleSource/Regression/C++/EH/dead_try_block
>> >> >> > SingleSource/Regression/C++/EH/exception_spec_test
>> >> >> > SingleSource/Regression/C++/EH/function_try_block
>> >> >> > SingleSource/Regression/C++/EH/inlined_cleanup
>> >> >> > SingleSource/Regression/C++/EH/recursive-throw
>> >> >> > SingleSource/Regression/C++/EH/simple_rethrow
>> >> >> > SingleSource/Regression/C++/EH/simple_throw
>> >> >> > SingleSource/Regression/C++/EH/throw_rethrow_test
>> >> >> > SingleSource/Regression/C++/fixups
>> >> >> > SingleSource/Regression/C++/global_ctor
>> >> >> > SingleSource/Regression/C/globalrefs
>> >> >> > SingleSource/Regression/C++/global_type
>> >> >> > SingleSource/Regression/C++/ofstream_ctor
>> >> >> > SingleSource/Regression/C/pointer_arithmetic
>> >> >> > SingleSource/Regression/C++/pointer_member
>> >> >> > SingleSource/Regression/C++/pointer_method
>> >> >> > SingleSource/Regression/C++/pointer_method2
>> >> >> > SingleSource/Regression/C/PR10189
>> >> >> > SingleSource/Regression/C/PR1386
>> >> >> > SingleSource/Regression/C/PR491
>> >> >> > SingleSource/Regression/C/PR640
>> >> >> > SingleSource/Regression/C++/short_circuit_dtor
>> >> >> > SingleSource/Regression/C/sumarray
>> >> >> > SingleSource/Regression/C/sumarraymalloc
>> >> >> > SingleSource/Regression/C/testtrace
>> >> >> > SingleSource/UnitTests/2002-04-17-PrintfChar
>> >> >> > SingleSource/UnitTests/2002-05-02-ArgumentTest
>> >> >> > SingleSource/UnitTests/2002-05-02-CastTest
>> >> >> > SingleSource/UnitTests/2002-05-02-CastTest1
>> >> >> > SingleSource/UnitTests/2002-05-02-CastTest2
>> >> >> > SingleSource/UnitTests/2002-05-02-CastTest3
>> >> >> > SingleSource/UnitTests/2002-05-02-ManyArguments
>> >> >> > SingleSource/UnitTests/2002-05-03-NotTest
>> >> >> > SingleSource/UnitTests/2002-05-19-DivTest
>> >> >> > SingleSource/UnitTests/2002-08-02-CastTest
>> >> >> > SingleSource/UnitTests/2002-08-02-CastTest2
>> >> >> > SingleSource/UnitTests/2002-08-19-CodegenBug
>> >> >> > SingleSource/UnitTests/2002-10-09-ArrayResolution
>> >> >> > SingleSource/UnitTests/2002-10-12-StructureArgs
>> >> >> > SingleSource/UnitTests/2002-10-12-StructureArgsSimple
>> >> >> > SingleSource/UnitTests/2002-10-13-BadLoad
>> >> >> > SingleSource/UnitTests/2002-12-13-MishaTest
>> >> >> > SingleSource/UnitTests/2003-04-22-Switch
>> >> >> > SingleSource/UnitTests/2003-05-02-DependentPHI
>> >> >> > SingleSource/UnitTests/2003-05-07-VarArgs
>> >> >> > SingleSource/UnitTests/2003-05-12-MinIntProblem
>> >> >> > SingleSource/UnitTests/2003-05-14-AtExit
>> >> >> > SingleSource/UnitTests/2003-05-26-Shorts
>> >> >> > SingleSource/UnitTests/2003-05-31-CastToBool
>> >> >> > SingleSource/UnitTests/2003-05-31-LongShifts
>> >> >> > SingleSource/UnitTests/2003-07-06-IntOverflow
>> >> >> > SingleSource/UnitTests/2003-07-08-BitOpsTest
>> >> >> > SingleSource/UnitTests/2003-07-09-LoadShorts
>> >> >> > SingleSource/UnitTests/2003-07-09-SignedArgs
>> >> >> > SingleSource/UnitTests/2003-07-10-SignConversions
>> >> >> > SingleSource/UnitTests/2003-08-05-CastFPToUint
>> >> >> > SingleSource/UnitTests/2003-08-11-VaListArg
>> >> >> > SingleSource/UnitTests/2003-08-20-FoldBug
>> >> >> > SingleSource/UnitTests/2003-09-18-BitFieldTest
>> >> >> > SingleSource/UnitTests/2003-10-13-SwitchTest
>> >> >> > SingleSource/UnitTests/2003-10-29-ScalarReplBug
>> >> >> > SingleSource/UnitTests/2004-02-02-NegativeZero
>> >> >> > SingleSource/UnitTests/2004-06-20-StaticBitfieldInit
>> >> >> > SingleSource/UnitTests/2004-11-28-GlobalBoolLayout
>> >> >> > SingleSource/UnitTests/2005-05-11-Popcount-ffs-fls
>> >> >> > SingleSource/UnitTests/2005-05-12-Int64ToFP
>> >> >> > SingleSource/UnitTests/2005-05-13-SDivTwo
>> >> >> > SingleSource/UnitTests/2005-07-17-INT-To-FP
>> >> >> > SingleSource/UnitTests/2005-11-29-LongSwitch
>> >> >> > SingleSource/UnitTests/2006-01-29-SimpleIndirectCall
>> >> >> > SingleSource/UnitTests/2006-02-04-DivRem
>> >> >> > SingleSource/UnitTests/2006-12-01-float_varg
>> >> >> > SingleSource/UnitTests/2006-12-04-DynAllocAndRestore
>> >> >> > SingleSource/UnitTests/2006-12-07-Compare64BitConstant
>> >> >> > SingleSource/UnitTests/2006-12-11-LoadConstants
>> >> >> > SingleSource/UnitTests/2007-01-04-KNR-Args
>> >> >> > SingleSource/UnitTests/2007-03-02-VaCopy
>> >> >> > SingleSource/UnitTests/2007-04-25-weak
>> >> >> > SingleSource/UnitTests/2008-04-18-LoopBug
>> >> >> > SingleSource/UnitTests/2008-04-20-LoopBug2
>> >> >> > SingleSource/UnitTests/2008-07-13-InlineSetjmp
>> >> >> > SingleSource/UnitTests/2009-04-16-BitfieldInitialization
>> >> >> > SingleSource/UnitTests/2009-12-07-StructReturn
>> >> >> > SingleSource/UnitTests/2010-05-24-BitfieldTest
>> >> >> > SingleSource/UnitTests/AtomicOps
>> >> >> > SingleSource/UnitTests/block-byref-cxxobj-test
>> >> >> > SingleSource/UnitTests/block-byref-test
>> >> >> > SingleSource/UnitTests/block-call-r7674133
>> >> >> > SingleSource/UnitTests/block-copied-in-cxxobj
>> >> >> > SingleSource/UnitTests/block-copied-in-cxxobj-1
>> >> >> > SingleSource/UnitTests/blockstret
>> >> >> > SingleSource/UnitTests/byval-alignment
>> >> >> > SingleSource/UnitTests/conditional-gnu-ext
>> >> >> > SingleSource/UnitTests/conditional-gnu-ext-cxx
>> >> >> > SingleSource/UnitTests/DefaultInitDynArrays
>> >> >> > SingleSource/UnitTests/FloatPrecision
>> >> >> > SingleSource/UnitTests/initp1
>> >> >> > SingleSource/UnitTests/member-function-pointers
>> >> >> > SingleSource/UnitTests/printargs
>> >> >> > SingleSource/UnitTests/SignlessTypes/cast2
>> >> >> > SingleSource/UnitTests/SignlessTypes/cast-bug
>> >> >> > SingleSource/UnitTests/SignlessTypes/ccc
>> >> >> > SingleSource/UnitTests/SignlessTypes/div
>> >> >> > SingleSource/UnitTests/SignlessTypes/factor
>> >> >> > SingleSource/UnitTests/SignlessTypes/Large/cast
>> >> >> > SingleSource/UnitTests/SignlessTypes/shr
>> >> >> > SingleSource/UnitTests/stmtexpr
>> >> >> > SingleSource/UnitTests/StructModifyTest
>> >> >> > SingleSource/UnitTests/TestLoop
>> >> >> > SingleSource/UnitTests/Threads/2010-12-08-tls
>> >> >> > SingleSource/UnitTests/Threads/tls
>> >> >> > SingleSource/UnitTests/Vector/build
>> >> >> > SingleSource/UnitTests/Vector/divides
>> >> >> > SingleSource/UnitTests/Vector/sumarray
>> >> >> > SingleSource/UnitTests/Vector/sumarray-dbl
>> >> >> > SingleSource/UnitTests/vla
>> >> >> >
>> >> >> > SingleSource/UnitTests/Vector/Altivec/2007-01-07-lvsl-lvsr-Regression
>> >> >> > SingleSource/UnitTests/Vector/Altivec/alti.sdot
>> >> >> > SingleSource/UnitTests/Vector/Altivec/casts
>> >> >> > SingleSource/UnitTests/Vector/Altivec/test1
>> >> >> >
>> >> >> > SingleSource/UnitTests/ms_struct-bitfield
>> >> >> > SingleSource/UnitTests/ms_struct-bitfield-1
>> >> >> > SingleSource/UnitTests/ms_struct-bitfield-init
>> >> >> > SingleSource/UnitTests/ms_struct-bitfield-init-1
>> >> >> > SingleSource/UnitTests/ms_struct_pack_layout
>> >> >> > SingleSource/UnitTests/ms_struct_pack_layout-1
>> >> >> >
>> >> >> > Please review.
>> >> >> >
>> >> >> > Thanks again,
>> >> >> > Hal
>> >> >> >
>> >> >> > --
>> >> >> > Hal Finkel
>> >> >> > Assistant Computational Scientist
>> >> >> > Leadership Computing Facility
>> >> >> > Argonne National Laboratory
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > llvm-commits mailing list
>> >> >> > llvm-commits at cs.uiuc.edu
>> >> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>



