[llvm-dev] [cfe-dev] [Openmp-dev] RFC: End-to-end testing

Thu Oct 17 08:38:53 PDT 2019

On 10/17/19 10:00 AM, David Greene via cfe-dev wrote:
> Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
>> The main thing I see that will justify push-back on such test is the
>> maintenance: you need to convince everyone that every component in LLVM
>> must also maintain (update, fix, etc.) the tests that are in other
>> components (clang, flang, other future subproject, etc.). Changing the
>> vectorizer in the middle-end may require now to understand the kind of
>> update a test written in Fortran (or Haskell?) is checking with some
>> Hexagon assembly. This is a non-trivial burden when you compute the full
>> matrix of possible frontend and backends.

That's true, but at some point we really do just need to work together 
to make changes. If some necessary group of people become unresponsive, 
then we'll need to deal with that, but just not knowing whether the 
compiler works as intended seems worse.

> That's true.  But don't we want to make sure the complete compiler works
> as expected?  And don't we want to be alerted as soon as possible if
> something breaks?  To my knowledge we have very few end-to-end tests of
> the type I've been thinking about.  That worries me.

I agree. We really should have more end-to-end testing for cases where 
we have end-to-end contracts. If we provide a pragma to ask for 
vectorization, or loop unrolling, or whatever, then we should test "end 
to end", whatever that means for the contract in question: from its 
beginning (i.e., the place where the request is asserted) to its end 
(i.e., the place where we can confirm that the user will observe the 
intended behavior). This might mean checking assembly or it might mean 
checking end-stage IR, etc. There are other cases where, even if there's no 
pragma, we know what the optimal output is and we can test for it. We've 
had plenty of cases where changes to the pass pipeline, instcombine, 
etc. have caused otherwise reasonably-well-covered components to stop 
behaving as expected in the context of the complete pipeline. 
Vectorization is a good example of this, but is not the only such 
example. As I recall, other loop optimizations (unrolling, idiom 
recognition, etc.) have also had these problems over time.
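
To make the pragma case concrete, here is a minimal sketch of what such a
test might look like, assuming it lives in a lit-driven test directory where
the %clang substitution is available and the default target has SIMD
support. The function name `scale`, the -O2 level, and the CHECK patterns
are illustrative assumptions, not an existing test in the tree. It asserts
the request in the source and checks for vector types in the end-stage IR.

```c
// RUN: %clang -O2 -S -emit-llvm %s -o - | FileCheck %s
//
// Hypothetical end-to-end test: the contract starts at the pragma below
// and ends at the IR produced by the full -O2 pipeline.
//
// CHECK-LABEL: define {{.*}}@scale(
// Any vector-of-float type in the function body is taken as evidence
// that the loop was vectorized (an assumption; a real test may want a
// tighter pattern).
// CHECK: <{{[0-9]+}} x float>

void scale(float *restrict dst, const float *restrict src, float k, int n) {
  // The user-visible request: ask the compiler to vectorize this loop.
#pragma clang loop vectorize(enable)
  for (int i = 0; i < n; ++i)
    dst[i] = k * src[i];
}
```

A test like this deliberately runs the whole pipeline rather than a single
pass, so a change to pass ordering or to instcombine that stops the loop
from being vectorized would show up as a failure here even if the
vectorizer's own unit tests still pass.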

>
>> Even if you write very small tests for checking vectorization, what is
>> next? What about unrolling, inlining, loop-fusion, etc. ? Why would we stop
>> the end-to-end FileCheck testing at vectorization?
> I actually think vectorization is probably lower on the concern list for
> end-to-end testing than more focused things like FMA generation,
> prefetching and so on.

In my experience, these are about equal. Vectorization running later in 
the pipeline means that fewer things can mess it up afterwards (although 
there still is all of codegen), but more things can mess it up beforehand.

  -Hal

>   This is because there isn't a lot after the
> vectorization pass that can mess up vectorization.  Once something is
> vectorized, it is likely to stay vectorized.  On the other hand, I have
> for example frequently seen prefetches dropped or poorly scheduled by
> code long after the prefetch got inserted into the IR.
>
>> So the monorepo vs the test-suite seems like a false dichotomy: if such
>> tests don't make it in the monorepo it will be (I believe) because folks
>> won't want to maintain them. Putting them "elsewhere" is fine but it does
>> not solve the question of the maintenance of the tests.
> Agree 100%.
>
>                        -David

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory