[llvm-dev] RFC: Add bitcode tests to test-suite

Alina Sbirlea via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 29 15:39:20 PST 2016


On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <alina.sbirlea at gmail.com>
> wrote:
>
>
>
> On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Feb 29, 2016, at 11:40 AM, Mehdi Amini via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>> On Feb 29, 2016, at 11:16 AM, Alina Sbirlea via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> All,
>>
>> To get the discussion going in a focused manner, here is an initial patch
>> with a running test. The test is from the Halide suite and checks the
>> correctness of several SIMD operations.
>> (Note: the patch is large due to the number of operations being tested;
>> I expect a lot of changes before actually landing it. This is simply to
>> continue the discussion using a concrete example.)
>> http://reviews.llvm.org/D17726
>>
>>
>> I can't figure out how to download the patch *with the bitcode files* from
>> Phabricator. Can you push this to github (or somewhere else)? (Or let me
>> know if I missed how to proceed...)
>>
>>
>> I was able to figure out how to get them "one by one"; it would still be
>> more convenient to have an archive or a repo to clone somewhere.
>>
>> A few questions/todos to start the discussion:
>> 1. What is a good location for these tests? They are in a separate
>> Bitcode directory at the moment, but use llvm_multisource. This may change
>> to more closely model the approach for external tests (see next item).
>>
>>
>> A good location would be their own external repository IMO :)
>>
>> 2. There is a single .cpp file testing all the operations provided by the
>> individual bitcode files. I expect this to change: instead of using
>> llvm_multisource, the same test would be run with specific arguments, each
>> run testing a single operation.
>> 3. The building approach I took is to first link all bitcode files into a
>> single one, then obtain the assembly for it, which cmake knows to take as
>> an input source.
>>
>>
>> Yeah, so I'd rather have a split-build model, with a split execution
>> model. Having a gigantic bitcode file to debug an issue is not friendly.
>> I'd expect to have a .cpp file that contains the main and the logic to
>> run the tests, and then every test that is linked in to be executed, a bit
>> like gtest does (there are multiple registration mechanisms that avoid
>> declaring each test explicitly in a header).
>> -> filters.h and filter_headers.h should just go away.
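>>
>> (For illustration only, a minimal sketch of such a self-registering
>> mechanism - all names here are hypothetical, not an actual proposal for
>> the harness - could look like this:)
>>
>> #include <functional>
>> #include <string>
>> #include <utility>
>> #include <vector>
>>
>> // Global list of tests, filled in at static-initialization time.
>> static std::vector<std::pair<std::string, std::function<int()>>> &tests() {
>>   static std::vector<std::pair<std::string, std::function<int()>>> t;
>>   return t;
>> }
>>
>> struct RegisterTest {
>>   RegisterTest(std::string name, std::function<int()> fn) {
>>     tests().emplace_back(std::move(name), std::move(fn));
>>   }
>> };
>>
>> // Each filter's glue .cpp would self-register, e.g.:
>> //   static RegisterTest reg("op_paddw", [] { return run_op_paddw(); });
>>
>> int main() {
>>   int failures = 0;
>>   for (auto &t : tests())
>>     failures += (t.second() != 0); // run every test that got linked in
>>   return failures != 0;
>> }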
>>
>
> I agree, this is related to point 2. The plan here is to update the
> current test .cpp file to test each operation individually. In this model
> it will be enough to link with a single bitcode file per test.
>
>
>>
>> Also, on the tests in general: we should have an idea for each test of
>> what it is doing and how.
>> I was expecting your tests to follow the pattern of having a C++
>> implementation and a Halide bitcode implementation of a filter
>> (or whatever), running both on random data, and verifying that the results
>> match.
>> Unfortunately, from what I can see you are feeding the tests with random
>> data, and the tests are "blackboxes" that set an error flag if they detect
>> an error.
>> This is not super robust: the compiler can mess with the error checking
>> and eliminate it, for instance, making any error go undetected.
>>
>
> The Halide bitcode filters compare the result of vectorized operations vs
> scalar runs of the same code. The error threshold against which we compare
> the output will be set to a loose tolerance - it is currently 0. We're
> interested in codegen bugs that return the wrong value entirely, not
> accuracy differences (especially for floating point tests).
> With the new error threshold, the data fed in may be random or read from
> provided input files; I can do either.
> The filters will still look somewhat like blackboxes, though the name of
> each filter says which operation is being tested, and the disassembled
> bitcode files are reasonably readable.
> Using your suggestion, the driver .cpp file will test one operation at a
> time (with argv set accordingly) and return right away once an error is
> found. Sound about right?
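>
> (For illustration, a rough sketch of such a driver - the filter names and
> the entry-point signatures below are made up - might be:)
>
> #include <cmath>
> #include <cstdio>
> #include <cstring>
>
> // Each Halide-generated bitcode file would export one entry point,
> // returning the maximum error it observed between vector and scalar runs.
> extern "C" double error_op_paddw();
> extern "C" double error_op_pcmpeqq();
>
> struct Filter { const char *name; double (*run)(); };
> static const Filter filters[] = {
>   {"op_paddw", error_op_paddw},
>   {"op_pcmpeqq", error_op_pcmpeqq},
> };
>
> int main(int argc, char **argv) {
>   const double tolerance = 1e-5; // loose threshold: codegen bugs only
>   if (argc < 2) return 2;
>   for (const Filter &f : filters) {
>     if (std::strcmp(argv[1], f.name) == 0) {
>       double err = f.run();
>       std::printf("%s: error = %g\n", f.name, err);
>       return std::fabs(err) > tolerance ? 1 : 0; // fail right away
>     }
>   }
>   std::fprintf(stderr, "unknown filter: %s\n", argv[1]);
>   return 2;
> }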
>
>
> All of this is great.
> The part that is not clear to me is what this buys us over (or why it is
> better for us than) having a C/C++ reference implementation of the filter,
> and hoisting (and refactoring) all the logic to feed the tests and validate
> the output *out* of the filters. A filter would be just the mathematical
> function performed and nothing more (separation of concerns, more robust
> framework, easier debugging when things go wrong, etc.).
>

I believe the answer is that Halide generates vectorized code in a way that
llvm does not when starting from C/C++.
Having a C/C++ scalar reference would involve quite a bit of effort for all
the tests. The primary reason Halide is being used is that you don't need to
write a lot of C/C++ code to get different optimizations of the same code
(e.g. vectorized vs scalar is a one-line difference). So we can get really
fast test coverage for possible codegen bugs by checking that different
layout optimizations in Halide give the same result.
I think having each filter tested separately should give a good separation
of concerns and easy debugging for each particular test.
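
To make that concrete, here is a rough sketch of what one such pair could
look like on the Halide generator side (purely illustrative - this is not
one of the actual filters, and the operation and file names are made up):

#include "Halide.h"
#include <vector>
using namespace Halide;

int main() {
  Var x;
  Func scalar_op("op_scalar"), vector_op("op_vector");

  // Identical algorithm in both Funcs.
  scalar_op(x) = cast<uint8_t>(min(x * 3 + 7, 255));
  vector_op(x) = cast<uint8_t>(min(x * 3 + 7, 255));

  // The one-line scheduling difference: only the second copy is vectorized.
  vector_op.vectorize(x, 16);

  // Each version is emitted as its own bitcode file for the test-suite;
  // the driver then checks that both produce the same output.
  std::vector<Argument> args; // no runtime parameters in this toy example
  scalar_op.compile_to_bitcode("op_scalar.bc", args, "op_scalar");
  vector_op.compile_to_bitcode("op_vector.bc", args, "op_vector");
  return 0;
}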



>
> Also, just looking quickly at one of the IR files, I'm surprised by
> things like:
>>
>> "assert succeeded165":                            ; preds = %"assert
>> succeeded146"
>>   %buf_host181 = getelementptr inbounds %struct.buffer_t,
>> %struct.buffer_t* %error_op_pcmpeqq_272.buffer, i64 0, i32 1
>>   %23 = bitcast i8** %buf_host181 to double**
>>   %error_op_pcmpeqq_272.host226227232 = load double*, double** %23, align
>> 8
>>   %24 = icmp eq %struct.buffer_t* %error_op_pcmpeqq_272.buffer, null
>>   br i1 %24, label %"assert failed183", label %"assert succeeded184",
>> !prof !4
>>
>> Here you have a check for nullptr at %24, but you already
>> loaded %error_op_pcmpeqq_272.host226227232 from this pointer just before!
>>
>
> It's checking that the host value loaded from buffer_t is not null. I
> don't see what's wrong with this. What am I missing?
>
>
> I may be misreading it; my impression when skimming through the code was
> that it seems equivalent to:
>
> void foo(buffer_t *out) {
>   auto value = out->host;  // dereference 'out'...
>   if (!out) {              // ...before checking it for null
>     error("nullptr");
>   }
> }
>
>
> In case I haven't been clear: I think this work is valuable for the
> project, and thank you for putting some effort into it :)
>
> --
> Mehdi
>
>>
>> A separate discussion is on reading metadata (mcpu and mattr) in llc. I
>> added a script to work around that for now.
>>
>>
>> The generic way of doing it in llvm is (I think) to use function
>> attributes:
>>
>> attributes #0 = { "target-cpu"="x86-64" "target-features"="+avx2" }
>>
>> You shouldn't need it on the command line I think?
>>
>
> Yes, I believe so too. Currently these are set in mcpu and mattr by Halide
> and not read in by llc, hence the need for feeding them as parameters. It's
> a separate issue that we'll need to go into in depth, but I don't want it
> to interfere with getting feedback on how to best publish these tests.
>
>
>>
>> --
>> Mehdi
>>
>>
>>
>>
>> Looking forward to your feedback!
>>
>> Thanks,
>> Alina
>>
>>
>>
>> On Fri, Feb 19, 2016 at 6:50 AM, Kristof Beyls <kristof.beyls at arm.com>
>> wrote:
>>
>>>
>>>
>>> On 18/02/2016 19:12, Alina Sbirlea via llvm-dev wrote:
>>>
>>>
>>>
>>>> I have more questions for Alina. What kind of tests do you have:
>>>>
>>>> - "the compiler takes the bitcode and generates code without crashing"
>>>> - "the compiled test runs without crashing"
>>>> - "the compiled test will produce an output that be checked against a
>>>> reference"
>>>> - "the compiled test is meaningful as a benchmarks"
>>>>
>>>
>>> We have all 4 kinds of tests in Halide. The bitcode files for the first
>>> category are already available, and I'm working on building the ones for
>>> the next 3. We'd like to include all of them incrementally.
>>>
>>>
>>> It seems to me that the first category ("the compiler takes the bitcode
>>> and generates code without crashing") consists of tests that should be
>>> added to the "make check-all" tests in the LLVM subproject, rather than
>>> the test-suite subproject?
>>> Or, if these tests currently don't crash the compiler anymore, the bugs
>>> must have been fixed and there should already be equivalent tests?
>>>
>>