[PATCH] D34362: [LNT] Support for different DataSet usage in Polybench for "lnt runtest nt"

Wed Jun 28 07:23:41 PDT 2017

kristof.beyls added a comment.

In https://reviews.llvm.org/D34362#792614, @MatzeB wrote:

> In https://reviews.llvm.org/D34362#790215, @cs14mtech11017 wrote:
>
> > In https://reviews.llvm.org/D34362#789280, @cmatthews wrote:
> >
> > > It makes more sense to me to use the --make-param flag to pass test-specific configuration options.  If you want to add all these size classes, all the tests should support them, or have them mapped back to the nearest size the tests can handle.
> >
> >
> > I feel having a common flag for test data sizes is better than having to pass them as test-specific flags. As you suggested, we can map them back to the nearest data sizes for the other tests. I was working with Polybench, so I did not look into other tests.  I will proceed with it if everyone agrees on adding a flag like "--testdatasizes=mini|small|medium|large|extra_large" and aliasing --small and --large for backward compatibility.
>
>
> Look at SPEC for example: It has 3 different sizes: test, train and ref. You can describe them roughly as:
>
> - test: "As fast as possible, not useful as a benchmark but they touch enough code paths so that they can give you a quick way to test for correctness"
> - train: "Somewhat realistic but smaller inputs, useful to produce data for PGO optimizations. The important aspect here is that the data is different enough from ref, so we don't overfit the code because training and reference data were the same"
> - ref: "Larger inputs running for a longer time producing stable numbers".
>
>   So there are some semantics and intended uses here that are captured fine with "test", "train" and "ref". Mapping this to some generic terms like "small", "medium", "large" that just map to sizes would be a loss IMO.

Right, that's a fair point.

FWIW, "lnt runtest test-suite" doesn't have --small or --large or --spec-with-ref command line options; but it does have a --test-size option, seemingly with the same intention as the "--testdatasizes" command line option under discussion here.
That being said, that "--test-size" option currently doesn't seem to actually be doing anything...

I can see that in different circumstances, you'd want either the generic flag or the SPEC-specific flags. For example:

- Assuming you want to run as many of the programs in the test-suite (with externals plugged in) as possible, you'd probably want the generic flag that maps to "small", "medium", "large" to control how long the benchmark runs, how much memory it could end up using, or maybe some other aspect of program scalability. I don't think I've seen small vs large defined particularly well anywhere...
- If you're only going to run the SPEC benchmarks, you'd probably want to be able to ask for "test", "train" or "ref", rather than "small", etc.
- Of course, if you want to retain the original terms and meanings for the data sizes used when benchmarking SPEC, you could argue the same applies to other benchmarks like Polybench.

So, in summary, I'm not sure what the conclusion here should be.
If we had a reasonable definition of what "small" or "large" meant, or intended to mean, that would make this decision easier.
For now, I think it's fine for --small and --large to map to traditional "small" and "large" semantics for the test-suite.
Maybe we should refrain from introducing further command line options in LNT for all possible benchmark-specific sizes and indeed set them using --cmake-define/--make-param, to avoid a further explosion of lnt runtest command line options?
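
To make the "map back to the nearest size the tests can handle" idea from earlier in the thread concrete, here is a rough sketch of what such a fallback could look like. This is only an illustration: the names SIZE_ORDER and nearest_supported_size are hypothetical and do not exist in LNT, and the size list is simply the one proposed for "--testdatasizes".

```python
# Hypothetical sketch; none of these names exist in LNT.
# Size classes in increasing order, as proposed for "--testdatasizes".
SIZE_ORDER = ["mini", "small", "medium", "large", "extra_large"]


def nearest_supported_size(requested, supported):
    """Return the size class in `supported` that is closest to `requested`.

    Ties are broken in favour of the smaller size, so a run never becomes
    longer than what was asked for.
    """
    if requested not in SIZE_ORDER:
        raise ValueError("unknown size class: %r" % (requested,))
    # Ignore anything the test advertises that is not a known size class.
    candidates = [s for s in supported if s in SIZE_ORDER]
    if not candidates:
        raise ValueError("test supports no known size class")
    want = SIZE_ORDER.index(requested)
    # Sort key: distance from the requested size first, then the smaller size.
    return min(
        candidates,
        key=lambda s: (abs(SIZE_ORDER.index(s) - want), SIZE_ORDER.index(s)),
    )
```

With this, a test that only knows "small" and "large" would run "small" when asked for "medium" and "large" when asked for "extra_large". Note that it only orders sizes; it does not capture the test/train/ref semantics MatzeB describes for SPEC.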

Repository:
  rL LLVM

https://reviews.llvm.org/D34362