[cfe-dev] test-suite: a new proposal for how to move forward to make "test-suite" more automatic, more flexible, and more maintainable, especially WRT reference outputs

Wed Oct 5 15:29:52 PDT 2016

Dear all,

Today I had an idea that might address the improvements we currently have "on the plate": the 
in-repository size of reference outputs, and how to allow FP optimizations while still keeping 
test programs in "test-suite" whose output depends on FP computations [and for which relatively 
small changes in FP accuracy, whether up/more-accurate or down/less-accurate, change the actual 
observed output].

For non-FP-dependent, fully-deterministic programs, we can choose the shortest [in # of bytes 
as reported by "ls"] of the following:

   * hash
   * compressed output
   * raw output

[in increasing order of "likely" size]

... or we can establish some minimum differentiating factors, e.g. "compressed output must be 
at least 2x smaller than raw output, otherwise stick to raw output" and "hash must be at least 
10x smaller than compressed output, otherwise stick to compressed output".  If needed/{strongly 
desired}, the rules can even be a little more complicated than that, e.g. "compressed output 
must be at least 2x smaller than raw output OR at least 4096 bytes smaller than raw output, 
otherwise stick to raw output".
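
To make the selection rules above concrete, here is a minimal Python sketch. Everything in it 
is an assumption for illustration only: the function name, the use of gzip as the compressor, 
SHA-256 as the hash, and the reading that the hash is only considered once compressed output 
has already beaten raw output. The thresholds are the example numbers quoted above.

```python
import gzip
import hashlib


def choose_reference_form(raw: bytes, allow_hash: bool = True) -> str:
    """Return "raw", "compressed", or "hash" for one reference output.

    Hypothetical helper: gzip and SHA-256 are stand-ins, not settled choices.
    """
    compressed = gzip.compress(raw)
    # Size of the hash as it would be stored in the repo (hex text).
    digest = hashlib.sha256(raw).hexdigest().encode("ascii")

    # "at least 2x smaller than raw output OR at least 4096 bytes smaller"
    compressed_wins = (len(compressed) * 2 <= len(raw)
                       or len(raw) - len(compressed) >= 4096)
    if not compressed_wins:
        return "raw"

    # "hash must be at least 10x smaller than compressed output"
    if allow_hash and len(digest) * 10 <= len(compressed):
        return "hash"
    return "compressed"
```

Passing allow_hash=False restricts the choice to {compressed output, raw output}, which is 
what the FP-dependent case discussed next would need.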

For programs that _are_ FP-dependent, not fully deterministic, or both, I propose that we 
choose only from the set {compressed output, raw output} because:

   1) small-enough variation in the result is expected, normal, and tolerated

and

   2) the raw reference output will then be available on the host running "lit" [after 
      decompression, if needed], so the "fpcmp" program can be told how much tolerance to 
      allow for each run.
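
The idea behind point 2 can be sketched as follows. This is NOT "fpcmp" itself, only a small 
Python stand-in showing why the raw reference text has to be available: a per-run tolerance 
can only be applied to actual numbers, not to a hash. The function names and the use of gzip 
are assumptions made for the example.

```python
import gzip
import math


def load_reference(data: bytes, compressed: bool) -> str:
    """Return the reference output as text, decompressing first if needed."""
    raw = gzip.decompress(data) if compressed else data
    return raw.decode("utf-8")


def outputs_match(actual: str, reference: str,
                  rel_tol: float, abs_tol: float = 0.0) -> bool:
    """Compare whitespace-separated tokens; numeric tokens get a tolerance."""
    a_tokens = actual.split()
    r_tokens = reference.split()
    if len(a_tokens) != len(r_tokens):
        return False
    for a, r in zip(a_tokens, r_tokens):
        try:
            a_num, r_num = float(a), float(r)
        except ValueError:
            # At least one token is not a number: require an exact match.
            if a != r:
                return False
            continue
        if not math.isclose(a_num, r_num, rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True
```

A run with output-changing FP optimizations enabled could then be checked with a looser 
rel_tol than a strict run, against the same stored reference.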

If we only choose from the set {compressed ref. output, raw ref. output} for these tests, then 
it should be relatively easy to run some tests with output-changing FP optimizations enabled, 
since those runs won't depend on the {no-output-changing-FP-optimizations} build having run 
first.  Although Hal's suggestion to have the {no-output-changing-FP-optimizations} build 
produce the output that will be analyzed by the {output-changing FP optimizations enabled} 
builds is an excellent one, implementing it in the context of "lit" seems to be considerably 
more difficult than we had hoped.  If anybody reading this knows how to make "lit" start one 
test only after another one has finished, please chime in.

If the community accepts compressed ref. outputs, then please let me know which of the 
following formats it would be acceptable to depend on being able to decompress:

   bz2
   gzip
   xz

I'm perfectly willing to write [a] wrapper[s] that will probe the system for programs that can 
decompress the chosen format[s] and will choose the best one.
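
For what it's worth, such a probe might look roughly like the Python sketch below. The 
candidate program names, the function names, and especially the preference order used to 
decide what "best" means are all placeholders of mine, not anything agreed upon.

```python
import shutil

# Format name -> program names to look for on PATH.
# Hypothetical table; extend it with whatever tools we decide to accept.
CANDIDATES = {
    "xz": ["xz"],
    "bz2": ["bzip2"],
    "gzip": ["gzip"],
}


def probe_decompressors(which=shutil.which):
    """Return {format: path} for every format with a usable program."""
    found = {}
    for fmt, programs in CANDIDATES.items():
        for prog in programs:
            path = which(prog)
            if path:
                found[fmt] = path
                break
    return found


def choose_format(found, preference=("xz", "bz2", "gzip")):
    """Pick the first available format in 'preference' (an arbitrary order)."""
    for fmt in preference:
        if fmt in found:
            return fmt
    return None
```

The "which" parameter exists only so the probe can be exercised without touching the real 
system; by default it uses shutil.which to search PATH.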

Regards,

Abe