[PATCH][TEST-SUITE] Marking more programs as not to be run in benchmark-only mode

Tue May 19 10:28:11 PDT 2015

Hi,

One substantial source of performance noise in the LNT test-suite is
programs that run for only a very short time.
The attached patch disables more of these programs in benchmark-only mode,
based on an analysis I've done of the programs that run for less than 10 ms
on http://llvm.org/perf/db_default/v4/nts/machine/39.

I think it is debatable whether some of these should still be run in
benchmark-only mode, as we also have code in LNT to just ignore all
benchmarks that run for less than 10 ms.
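
To make the trade-off concrete, here is a minimal sketch of that kind of
threshold filter. It is not LNT's actual code: the function name, the
dictionary layout and the runtimes are assumptions made up for
illustration; only the 10 ms cutoff comes from the text above.

```python
# Hypothetical sketch; this is NOT LNT's actual implementation.
MIN_RUNTIME_SECONDS = 0.010  # the 10 ms cutoff mentioned above


def split_by_runtime(runtimes, threshold=MIN_RUNTIME_SECONDS):
    """Split {program: runtime_in_seconds} into (kept, ignored) dicts.

    Programs whose runtime is below the threshold are treated as too
    short-running to give a meaningful measurement.
    """
    kept, ignored = {}, {}
    for name, seconds in runtimes.items():
        target = ignored if seconds < threshold else kept
        target[name] = seconds
    return kept, ignored


# Made-up runtimes, for illustration only.
example = {
    "SingleSource/Benchmarks/Stanford/IntMM": 0.004,
    "hypothetical/long-running-program": 1.250,
}
kept, ignored = split_by_runtime(example)
```

A filter like this hides the noisy results at reporting time, whereas the
patch avoids running the programs as benchmarks in the first place.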

Here is the list of programs that are removed in benchmark-only mode by the
attached patch.

The programs that clearly don't have value as a benchmark:
* SingleSource/UnitTests/Vector: constpool, simple:
  neither has any loops in the code.
* SingleSource/UnitTests/Vector/AArch64: aarch64_neon_intrinsics:
  doesn't have any loop in the code.
* SingleSource/UnitTests/Vector/NEON: simple:
  doesn't have any loop in the code.
* SingleSource/UnitTests: 2005-07-15-Bitfield-ABI, 2006-01-23-UnionInit,
  2007-04-10-BitfieldTest:
  none of these has any loops in the code.
* MultiSource/Benchmarks/Prolangs-C: loader:
  This program exits immediately because no arguments are given on the
  command line. Unless someone creates inputs for this program, it should
  not be considered a benchmark.

The programs for which it's debatable whether they have value or not.
I think none of these do enough work to be considered as a benchmark:
* MultiSource/Benchmarks/McCat/:
  - 15-trie produces trie data structures, but the program has no loops, so
    it's probably IO bound and therefore shouldn't be considered a benchmark.
* MultiSource/Benchmarks/Prolangs-C:
  - cdecl: parses about 70 C declarations.
* MultiSource/Benchmarks/MiBench:
  - office-stringsearch searches for a few hundred substrings in a set of a
    few hundred strings.
  - telecom-adpcm seems to spend most of its time in IO. To be run as a
    benchmark, it should have a loop that applies the main data
    transformation multiple times to the same buffer.
* SingleSource/Benchmarks/Stanford:
  - IntMM: does 10 matrix multiplications of size 40x40.
* SingleSource/Regression/C/Makefile: matrixTranspose, sumarray2d,
  test_indvars:
  - matrixTranspose: transposes a 32x32 matrix 10 times.
  - sumarray2d: creates a 100x100 matrix and sums all the elements in it.
  - test_indvars: traverses a 20000-element matrix twice.
* SingleSource/UnitTests/SignlessTypes:
  - rem: does 100 gcd computations.

The following programs also run for only a very short time, but I do think
they have value as benchmarks, so the attached patch doesn't disable them in
benchmark-only mode:
* SingleSource/Benchmarks/Misc/lowercase
* SingleSource/Benchmarks/Shootout/objinst
* SingleSource/Benchmarks/Shootout-C++/objinst
For all three, it seems that the main computation in the benchmark is
completely optimized away. Keeping these running as benchmarks should allow
us to catch any regression in LLVM's ability to optimize away the main
computation in these programs.

What do you think?

Thanks,

Kristof
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exclude_short_running_programs_from_benchmark_mode.diff
Type: application/octet-stream
Size: 5307 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150519/b862ada5/attachment.obj>