[llvm-dev] RFC: Add bitcode tests to test-suite

Alina Sbirlea via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 10 14:32:13 PST 2016


Patch updated (http://reviews.llvm.org/D17726).
Suggestions on how to improve this further are welcome!

(As before, due to the size of the patch, here's the alternative github
version: https://github.com/alinas/Bitcode)

Thank you,
Alina


On Tue, Mar 1, 2016 at 10:54 AM, Alina Sbirlea <alina.sbirlea at gmail.com>
wrote:

> Comments noted! Thank you for the feedback, this is great!
> I will integrate as many of your suggestions as possible in the updated
> patch and continue the discussion from there.
>
>
> On Tue, Mar 1, 2016 at 12:45 AM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Feb 29, 2016, at 11:55 PM, Mehdi Amini via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>> On Feb 29, 2016, at 11:28 PM, Chandler Carruth <chandlerc at google.com>
>> wrote:
>>
>> I really don't understand this. Why is a C++ reference useful? Instead,
>> we have an IR reference: the unoptimized IR.
>>
>>
>> Mmmmm, thinking more about it, yes, we could compile the IR twice and
>> use the unoptimized one as a reference.
>> However, right now the tests are only running llc, so the optimization
>> level is not that impactful and it is not clear to me that a miscompile
>> would be detected this way.
>>
>> Which leads me to a separate question for Alina: is this IR "unoptimized"?
>> What kind of optimization pipeline is used by Halide? Is the plan to only
>> exercise CodeGen?
>>
>>
>> Answer to myself:
>> https://github.com/halide/Halide/blob/master/src/CodeGen_LLVM.cpp#L836
>> This is the "regular" O3 pipeline, with the addition of the
>> "AlwaysInlinerPass" at the beginning, and the "Function Passes" re-run
>> again in a "custom way" at the end of the pipeline (??).
>>
>> I feel it would increase our coverage to have the committed tests be the
>> unoptimized IR out of the Halide CodeGen, and to run the optimization
>> pipeline over it.
>>
>> --
>> Mehdi
>>
>>
>>
>>
>> I think it's pretty important not to phrase everything in terms of C or
>> C++ or Clang... =/
>>
>> What if there are IR constructs that simply cannot be produced by C++?
>> We're adding lots of those for managed languages.
>>
>>
>> Having a first-class test-suite for IR with a managed runtime is probably
>> a whole new discussion :)
>>
>> If there are practical concerns with having IR-based test cases, I'd like
>> to actually discuss those head-on rather than fishing for phrasing even
>> more of our testing in terms of a single language and front-end.
>>
>>
>> I think we're not talking about the same thing, because I feel that
>> "phrasing the test in terms of a single language" misrepresents my
>> point about having a reference in C++. I mentioned in a previous email that
>> I'm all for having the Halide source file alongside the IR to "phrase the
>> test". The point about the reference is: how do we validate the output of
>> running the IR test?
>> I'm fine with an alternate design, like having the reference computed in
>> Python or Julia (or ...), but I assumed C++ would just seem more natural to
>> everyone and integrate with the test infrastructure (written in C++).
>> I'm also fine with having a textual reference output file instead of
>> C++ code that computes it (I believe this is what we are doing for the
>> C/C++ tests in the test-suite), but then we lose the current scheme of
>> having random input for these tests.
>> (By the way, Alina: the seed needs to be kept and printed on errors for
>> reproducibility; a command-line option to pass it would also be nice.)
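>>
>> Something along these lines is what I have in mind (a minimal sketch,
>> all names hypothetical):
>>
>> #include <cstdint>
>> #include <cstdio>
>> #include <cstdlib>
>> #include <ctime>
>> #include <random>
>> #include <vector>
>>
>> int main(int argc, char **argv) {
>>   // Take the seed from the command line if given, else from the clock.
>>   uint64_t seed = argc > 1 ? std::strtoull(argv[1], nullptr, 0)
>>                            : static_cast<uint64_t>(std::time(nullptr));
>>   std::mt19937_64 rng(seed);
>>   std::vector<float> input(1024);
>>   std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
>>   for (float &v : input)
>>     v = dist(rng); // random test input
>>   bool ok = true;  // would be: run the filter on 'input' and compare
>>   if (!ok) {
>>     // Print the seed so the failing run can be reproduced.
>>     std::fprintf(stderr, "FAILED (reproduce with seed %llu)\n",
>>                  (unsigned long long)seed);
>>     return 1;
>>   }
>>   return 0;
>> }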
>>
>> --
>> Mehdi
>>
>>
>>
>>
>> On Tue, Mar 1, 2016 at 2:16 AM Mehdi Amini via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> On Feb 29, 2016, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>
>>>
>>>
>>> ------------------------------
>>>
>>> *From: *"Mehdi Amini via llvm-dev" <llvm-dev at lists.llvm.org>
>>> *To: *"Alina Sbirlea" <alina.sbirlea at gmail.com>
>>> *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>
>>> *Sent: *Monday, February 29, 2016 7:06:51 PM
>>> *Subject: *Re: [llvm-dev] RFC: Add bitcode tests to test-suite
>>>
>>>
>>>
>>> Sent from my iPhone
>>>
>>> On Feb 29, 2016, at 3:39 PM, Alina Sbirlea <alina.sbirlea at gmail.com>
>>> wrote:
>>>
>>>
>>>
>>> On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <mehdi.amini at apple.com>
>>> wrote:
>>>
>>>>
>>>> On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <alina.sbirlea at gmail.com>
>>>> wrote:
>>>>
>>>>
>>>>
>>>> On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <mehdi.amini at apple.com>
>>>> wrote:
>>>>
>>>>>
>>>>> On Feb 29, 2016, at 11:40 AM, Mehdi Amini via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>
>>>>> On Feb 29, 2016, at 11:16 AM, Alina Sbirlea via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>> All,
>>>>>
>>>>> To get the discussion going in a focused manner, here is an initial
>>>>> patch with a running test. The test is from the Halide suite and
>>>>> checks the correctness of several SIMD operations.
>>>>> (Notes: the patch is large due to the number of operations being
>>>>> tested; I expect a lot of changes before actually landing it. This is
>>>>> simply to continue the discussion using a concrete example.)
>>>>> http://reviews.llvm.org/D17726
>>>>>
>>>>>
>>>>> I can't figure out how to download the patch *with the bitcode files*
>>>>> from Phabricator. Can you push this on github (or somewhere else)? (Or
>>>>> let me know if I just missed how to proceed...)
>>>>>
>>>>>
>>>>> I was able to figure out how to get them "one by one"; it would still
>>>>> be more convenient to have an archive or a repo to clone somewhere.
>>>>>
>>>>> A few questions/todos to start the discussion:
>>>>> 1. What is a good location for these tests? They are in a separate
>>>>> Bitcode directory atm, but use llvm_multisource. This may change to
>>>>> more closely model the approach for external tests (see next item).
>>>>>
>>>>>
>>>>> A good location would be their own external repository IMO :)
>>>>>
>>>>> 2. There is a single .cpp file testing all operations provided by
>>>>> individual bitcode files. I expect this to change: instead, I plan to
>>>>> use llvm_multisource to have the same test run with specific arguments,
>>>>> each run testing a single operation.
>>>>> 3. The building approach I took is to first link all bitcode files
>>>>> into a single one, then obtain the assembly for it, which cmake knows to
>>>>> take as an input source.
>>>>>
>>>>>
>>>>> Yeah, so I'd rather have a split-build model, with a split execution
>>>>> model. Having a gigantic bitcode file to debug an issue is not friendly.
>>>>> I'd expect to have a .cpp file that contains the main and the logic to
>>>>> run the tests, and then every test that is linked in gets executed, a
>>>>> bit like gtest does (there are multiple registration mechanisms that
>>>>> would avoid having to declare each test explicitly in a header).
>>>>> -> filters.h and filter_headers.h should just go away.
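>>>>>
>>>>> To sketch the kind of registration mechanism I mean (purely
>>>>> hypothetical names, not what the patch does today): each per-filter
>>>>> translation unit registers itself during static initialization, so no
>>>>> central header has to list the tests.
>>>>>
>>>>> #include <functional>
>>>>> #include <map>
>>>>> #include <string>
>>>>>
>>>>> using FilterFn = std::function<int()>; // returns 0 on success
>>>>>
>>>>> // In a shared header; 'inline' guarantees a single registry instance
>>>>> // across translation units.
>>>>> inline std::map<std::string, FilterFn> &filterRegistry() {
>>>>>   static std::map<std::string, FilterFn> registry;
>>>>>   return registry;
>>>>> }
>>>>>
>>>>> struct RegisterFilter {
>>>>>   RegisterFilter(const std::string &name, FilterFn fn) {
>>>>>     filterRegistry()[name] = std::move(fn);
>>>>>   }
>>>>> };
>>>>>
>>>>> // In each per-filter .cpp:
>>>>> //   static RegisterFilter reg("op_pcmpeqq", run_op_pcmpeqq);
>>>>> // The driver's main() then just iterates filterRegistry().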
>>>>>
>>>>
>>>> I agree, this is related to point 2. The plan here is to update the
>>>> current test .cpp file to test each operation individually. In this model
>>>> it will be enough to link with a single bitcode file per test.
>>>>
>>>>
>>>>>
>>>>> Also, on the tests in general: for each test we should have an idea of
>>>>> what it is doing and how.
>>>>> I was expecting your tests to follow the pattern of having an
>>>>> implementation in C++ and an implementation in Halide bitcode of a
>>>>> filter (or whatever), running both on random data, and verifying that
>>>>> the results match.
>>>>> Unfortunately, from what I can see you are feeding the tests with
>>>>> random data, and the tests are "blackboxes" that set an error flag if they
>>>>> detect an error.
>>>>> This is not super robust: the compiler can mess with the error
>>>>> checking and, for instance, eliminate it, leaving any error undetected.
>>>>>
>>>>
>>>> The Halide bitcode filters compare the result of vectorized operations
>>>> vs. scalar runs of the same code. The error threshold against which we
>>>> compare the output will be set to a loose tolerance (it is currently 0).
>>>> We're interested in codegen bugs that return the wrong value entirely,
>>>> not accuracy differences (especially for floating point tests).
>>>> With the new error threshold, the data fed in may be random or read
>>>> from provided input files; I can do either.
>>>> The filters will still look somewhat like blackboxes, though the name
>>>> of the filter says which operation is being tested, and the disassembled
>>>> bitcode files are reasonably readable.
>>>> Using your suggestion, the driver .cpp file will test one operation at
>>>> a time (argv set accordingly) and return right away once an error is
>>>> found. Does that sound about right?
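>>>>
>>>> Roughly what I have in mind for the per-operation driver (a sketch;
>>>> the filter names and signatures here are hypothetical, not the actual
>>>> Halide ABI):
>>>>
>>>> #include <cmath>
>>>> #include <cstdio>
>>>> #include <vector>
>>>>
>>>> // Provided by the linked-in bitcode: a vectorized and a scalar variant.
>>>> extern "C" void filter_vec(const float *in, float *out, int n);
>>>> extern "C" void filter_scalar(const float *in, float *out, int n);
>>>>
>>>> static bool withinTolerance(float a, float b, float tol) {
>>>>   return std::fabs(a - b) <= tol * (1.0f + std::fabs(b));
>>>> }
>>>>
>>>> int runOne(const std::vector<float> &in, float tol) {
>>>>   std::vector<float> vec(in.size()), ref(in.size());
>>>>   filter_vec(in.data(), vec.data(), (int)in.size());
>>>>   filter_scalar(in.data(), ref.data(), (int)in.size());
>>>>   for (size_t i = 0; i < in.size(); ++i)
>>>>     if (!withinTolerance(vec[i], ref[i], tol)) {
>>>>       // Return right away on the first mismatch.
>>>>       std::fprintf(stderr, "mismatch at %zu: %f vs %f\n", i, vec[i],
>>>>                    ref[i]);
>>>>       return 1;
>>>>     }
>>>>   return 0;
>>>> }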
>>>>
>>>>
>>>> All of this is great.
>>>> The part that is not clear to me is what this buys us over (or why it
>>>> is better for us than) having a C/C++ reference implementation of the
>>>> filter, and hoisting (and refactoring) all the logic to feed the tests
>>>> and validate the output *out* of the filters. A filter would be just the
>>>> mathematical function performed and nothing more (separation of
>>>> concerns, more robust framework, easier debugging when things go wrong,
>>>> etc.).
>>>>
>>>
>>> I believe the answer is that Halide generates vectorized code in ways
>>> that LLVM does not generate when starting from C/C++.
>>>
>>>
>>> I don't really see how *this* addresses my point. This is justifying
>>> why your bitcode is interesting and why we are having this conversation at
>>> all :)
>>> It does not say why we can't have a naive scalar C/C++ implementation
>>> along with the bitcode for the filter.
>>>
>>>
>>> Having a C/C++ scalar reference would involve quite a bit of effort for
>>> all tests. The primary reason Halide is being used is that you don't need
>>> to write a lot of C/C++ code to get different optimizations for the same
>>> code (e.g. vectorized vs. scalar is a one-line difference).
>>>
>>>
>>> Yes, this is what is nice about Halide: "write once, codegen multiple
>>> variants". But it does not mean you can't write a C++ reference for every
>>> Halide filter (not for every codegen variant of a filter!).
>>>
>>> It's been 2 years since my last experiments with Halide, but my memory
>>> is that there was a C backend?
>>> I had in mind for each test to have (possibly in a separate directory
>>> for each test):
>>>  - the Halide source for the filter.
>>>  - the C/C++ reference (maybe generated?) for the filter.
>>>  - the bitcode generated for the filter (potentially multiple variants
>>> depending on the CPU support and/or the schedule).
>>>
>>> Then some common code/infrastructure to interact with the individual
>>> filters, loading them, executing the variants for a filter, and checking
>>> results.
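>>>
>>> In code, the common part could look something like this (purely a
>>> sketch, all names hypothetical):
>>>
>>> #include <cstdio>
>>> #include <vector>
>>>
>>> using FilterRun = void (*)(const float *in, float *out, int n);
>>>
>>> struct FilterVariant {
>>>   const char *name; // e.g. "sse41", "avx2", or a schedule name
>>>   FilterRun run;    // entry point from the generated bitcode
>>> };
>>>
>>> // Run every codegen variant of one filter and compare it against the
>>> // single C/C++ reference implementation.
>>> int checkFilter(FilterRun reference,
>>>                 const std::vector<FilterVariant> &variants,
>>>                 const std::vector<float> &input) {
>>>   std::vector<float> ref(input.size()), out(input.size());
>>>   reference(input.data(), ref.data(), (int)input.size());
>>>   int failures = 0;
>>>   for (const auto &v : variants) {
>>>     v.run(input.data(), out.data(), (int)input.size());
>>>     if (out != ref) { // or a tolerance-based compare
>>>       std::fprintf(stderr, "variant %s diverges from reference\n", v.name);
>>>       ++failures;
>>>     }
>>>   }
>>>   return failures;
>>> }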
>>>
>>> If the reference C/C++ can't be generated by Halide (or obtained
>>> somehow), and we can't do better than the current test infrastructure you
>>> have, then I'm worried about the cost/benefit of this test-suite.
>>>
>>>
>>> I think that a C/C++ version would be nice to have, but not necessary.
>>>
>>> IR generated by non-Clang frontends and/or IR going through alternate
>>> optimization pipelines tends to hit bugs that are much harder to hit with
>>> Clang alone.
>>>
>>>
>>> To make it clear: the point of the C++ reference I was asking for is
>>> *not to stress clang* at all. It is intended to compute the *golden result*
>>> to be compared to the runs of each variant for a filter.  Having a
>>> reference is important when a diff is detected in the output and you need
>>> to figure out what is going on.
>>>
>>>
>>> It would help to have a description of what each test does, but
>>> including, for example, the Halide source code for each test will hopefully
>>> be enough of a guide.
>>>
>>>
>>> So we can get really fast test coverage for possible codegen bugs by
>>> checking that different layout optimizations in Halide give the same
>>> result.
>>> I think having each filter tested separately should give a good
>>> separation of concerns and easy debugging for each particular test.
>>>
>>>
>>> This is great for Halide validation; we all agree on this, I think. The
>>> question is where the tradeoff is for the LLVM project. I'm trying to
>>> make sure that the extra coverage doesn't come with a burden to debug and
>>> triage issues when something breaks: i.e. the tests need to be very
>>> friendly to interact with.
>>>
>>>
>>> I don't think that any non-trivial tests are truly "friendly" to
>>> interact with.
>>>
>>>
>>> Yeah, I just think we shouldn't make it arbitrarily worse by not having
>>> a good infrastructure to begin with :)
>>>
>>> In the end, if we expect this suite to be accepted and maintained as a
>>> "first-class citizen" by LLVM developers (i.e. accepting things like
>>> reverting a commit if it breaks something in this suite), we'd better make
>>> sure the burden to interact with it is minimal.
>>>
>>>
>>> Tracking down self-hosting bugs is not friendly, and those aren't
>>> anywhere near the worst ;)
>>>
>>> These tests with their simple driver seem like good input that bugpoint
>>> can reduce (assuming the tests' runtimes are not too long), and that's
>>> friendlier than most of the other multisource tests.
>>>
>>>
>>> If we're talking exclusively about crashes, I agree. However, if we're
>>> considering correctness issues as well (miscompiles), I believe that the
>>> structural changes I proposed are very important, for instance to easily
>>> run bugpoint on the tests (otherwise bugpoint would just turn a test into
>>> "return false;").
>>> Also, I believe these changes are necessary to perform timing
>>> measurements for these tests, if we are interested in the quality of the
>>> optimization/codegen (to be hooked into something like
>>> https://github.com/google/benchmark ?).
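>>>
>>> For the timing part, each variant could become a benchmark case along
>>> these lines (a sketch against the google/benchmark API; the filter name
>>> and signature are hypothetical):
>>>
>>> #include <benchmark/benchmark.h>
>>> #include <vector>
>>>
>>> extern "C" void filter_vec(const float *in, float *out, int n);
>>>
>>> static void BM_FilterVec(benchmark::State &state) {
>>>   const int n = 1 << 16;
>>>   std::vector<float> in(n, 1.0f), out(n);
>>>   while (state.KeepRunning()) // the timed region
>>>     filter_vec(in.data(), out.data(), n);
>>> }
>>> BENCHMARK(BM_FilterVec);
>>>
>>> BENCHMARK_MAIN();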
>>>
>>> Best,
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>
>>>
>>>  -Hal
>>>
>>> This is the motivation for my comments so far.
>>> Other people in the community may have a different opinion/appreciation
>>> of the situation; this just represents my current thoughts.
>>>
>>> Hope it makes sense.
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>
>>>
>>>
>>>>
>>>> Also, just looking quickly at one IR I'm surprised by things like:
>>>>>
>>>>> "assert succeeded165":                            ; preds = %"assert
>>>>> succeeded146"
>>>>>   %buf_host181 = getelementptr inbounds %struct.buffer_t,
>>>>> %struct.buffer_t* %error_op_pcmpeqq_272.buffer, i64 0, i32 1
>>>>>   %23 = bitcast i8** %buf_host181 to double**
>>>>>   %error_op_pcmpeqq_272.host226227232 = load double*, double** %23,
>>>>> align 8
>>>>>   %24 = icmp eq %struct.buffer_t* %error_op_pcmpeqq_272.buffer, null
>>>>>   br i1 %24, label %"assert failed183", label %"assert succeeded184",
>>>>> !prof !4
>>>>>
>>>>> Here you have a check for nullptr at %24, but you already
>>>>> loaded %error_op_pcmpeqq_272.host226227232 from this pointer just before!
>>>>>
>>>>
>>>> It's checking that the host value loaded from buffer_t is not null. I
>>>> don't see what's wrong with this. What am I missing?
>>>>
>>>>
>>>> I may be misreading it; my impression when skimming through the code
>>>> was that it seems equivalent to:
>>>>
>>>> void foo(buffer_t *out) {
>>>>   auto value = out->host; // dereferences 'out'...
>>>>   if (!out) {             // ...before checking it for null
>>>>     error("nullptr");
>>>>   }
>>>> }
>>>>
>>>>
>>>> In case I haven't been clear: I think this work is valuable for the
>>>> project, and thank you for putting some effort into it :)
>>>>
>>>> --
>>>> Mehdi
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> A separate discussion is on reading metadata (mcpu and mattr) in llc.
>>>>> I added a script to work around that for now.
>>>>>
>>>>>
>>>>> The generic way of doing it in llvm is (I think) to use function
>>>>> attributes:
>>>>>
>>>>> attributes #0 = { "target-cpu"="x86-64" "target-features"="+avx2" }
>>>>>
>>>>> You shouldn't need it on the command line I think?
>>>>>
>>>>
>>>> Yes, I believe so too. Currently these are set in mcpu and mattr by
>>>> Halide and not read by llc, hence the need to feed them as parameters.
>>>> It's a separate issue that we'll need to go into in depth, but I don't
>>>> want it to interfere with getting feedback on how best to publish these
>>>> tests.
>>>>
>>>>
>>>>>
>>>>> --
>>>>> Mehdi
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Looking forward to your feedback!
>>>>>
>>>>> Thanks,
>>>>> Alina
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 19, 2016 at 6:50 AM, Kristof Beyls <kristof.beyls at arm.com>
>>>>>  wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 18/02/2016 19:12, Alina Sbirlea via llvm-dev wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I have more questions for Alina. What kind of tests do you have:
>>>>>>>
>>>>>>> - "the compiler takes the bitcode and generates code without
>>>>>>> crashing"
>>>>>>> - "the compiled test runs without crashing"
>>>>>>> - "the compiled test will produce an output that be checked against
>>>>>>> a reference"
>>>>>>> - "the compiled test is meaningful as a benchmarks"
>>>>>>>
>>>>>>
>>>>>> We have all 4 kinds of tests in Halide. The bitcode files for the
>>>>>> first category are already available, and I'm working on building the
>>>>>> ones for the next 3. We'd like to include them all incrementally.
>>>>>>
>>>>>>
>>>>>> It seems to me that the first category ("the compiler takes the
>>>>>> bitcode and generates code without crashing") consists of tests that
>>>>>> should be added to the "make check-all" tests in the LLVM subproject,
>>>>>> rather than the test-suite subproject?
>>>>>> Or, if these tests don't crash the compiler anymore, the bugs must
>>>>>> have been fixed, and there should already be equivalent tests?
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>>
>>
>>
>>
>>
>