[test-suite] r261857 - [cmake] Add support for arbitrary metrics

Thu Mar 24 00:47:21 PDT 2016

Just wanted to add my 2 cents on splitting benchmarks into smaller
entities/binaries.
With the hash of each binary recorded as currently implemented, splitting
each kernel out into a separate binary would let us detect very quickly
whether a compiler change caused a code generation change in an
individual kernel.
I can't immediately think of the best way to compute an "internal" hash
for the code generated for individual kernels. Maybe if we compiled with
-ffunction-sections, put each kernel in a separate function and found an
easy way to compute the hash for each section? I'm not sure whether we
would miss performance-affecting bytes if we only hashed the section the
function was in. This all assumes that all code is fully inlined into the
kernel's function, of course.
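
As a rough illustration of the per-section hashing idea, here is a minimal
Python sketch. It is only an assumption of how this could look, not
something that exists in the test-suite: the function name
hash_text_sections is made up, and it assumes a little-endian ELF64 object
file built with -ffunction-sections, so that each kernel function ends up
in its own ".text.<name>" section.

```python
import hashlib
import struct


def hash_text_sections(elf: bytes, prefix: str = ".text.") -> dict:
    """Return {section_name: sha256 hex digest} for matching sections.

    Hypothetical helper: assumes `elf` holds a little-endian ELF64 file.
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")

    # Section header table location and shape, from the ELF header.
    (e_shoff,) = struct.unpack_from("<Q", elf, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 0x3A)

    def header(index):
        # First six fields of an ELF64 section header:
        # sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        return struct.unpack_from("<IIQQQQ", elf, e_shoff + index * e_shentsize)

    # The section-name string table is itself a section.
    strtab_off, strtab_size = header(e_shstrndx)[4:6]
    strtab = elf[strtab_off:strtab_off + strtab_size]

    hashes = {}
    for index in range(e_shnum):
        sh_name, sh_type, _, _, sh_offset, sh_size = header(index)
        name = strtab[sh_name:strtab.index(b"\0", sh_name)].decode()
        # Skip SHT_NOBITS (type 8) sections: they occupy no file bytes.
        if name.startswith(prefix) and sh_type != 8:
            data = elf[sh_offset:sh_offset + sh_size]
            hashes[name] = hashlib.sha256(data).hexdigest()
    return hashes
```

Comparing the returned dictionaries from two compiler revisions would show
which kernels changed. Note that in a relocatable object the relocation
entries for a function live in a separate section (e.g.
".rela.text.<name>" on x86-64), so hashing only the ".text.<name>" bytes
does not cover them.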

Thanks,

Kristof

On 23/03/2016 23:19, "llvm-commits on behalf of Hal Finkel via
llvm-commits" <llvm-commits-bounces at lists.llvm.org on behalf of
llvm-commits at lists.llvm.org> wrote:

>----- Original Message -----
>> From: "Matthias Braun" <mbraun at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "James Molloy" <James.Molloy at arm.com>, "nd" <nd at arm.com>,
>>"llvm-commits" <llvm-commits at lists.llvm.org>
>> Sent: Friday, March 4, 2016 12:36:36 PM
>> Subject: Re: [test-suite] r261857 - [cmake] Add support for arbitrary
>>metrics
>> 
>> A test can report "internal" metrics now, though I don't think LNT
>> would split those into a notion of sub-tests.
>> It would be an interesting feature to add. Though if we have the
>> choice to modify a benchmark, we should still prefer smaller
>> independent ones IMO as that gives a better idea when some of the
>> other metrics change (compiletime, codesize, hopefully things like
>> memory usage or performance counters in the future).
>
>Unless the kernels are large, their code size within the context of a
>complete executable might be hard to track regardless (because by the
>time you add in the static libc startup code, ELF headers, etc. any
>change would be a smaller percentage of the total). Explicitly
>instrumenting the code to mark regions of interest is probably best
>(which is true for timing too), but that seems like a separate (although
>worthwhile) project.
>
>In any case, for TSVC, for example, the single test has 136 kernels,
>which I currently group into 18 binaries. I have a float and double
>version for each, so we have 36 total binaries. What you're suggesting
>would have us produce 272 separate executables, just for TSVC. Ideally,
>I'd like aligned and unaligned variants of each of these. I've not done
>that because I thought that 72 executables would be a bit much, but
>that's 544 executables if I generate one per kernel variant.
>
>The LCALS benchmark, which I'd really like to add sometime soon, has
>another ~100 kernels, which is ~200 to do both float and double (which we
>should do).
>
>What do you think is reasonable here?
>
>Thanks again,
>Hal