[llvm-dev] RFC: Add bitcode tests to test-suite

Chandler Carruth via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 29 23:28:33 PST 2016


I really don't understand this. Why is a C++ reference useful? Instead, we
have an IR reference: the unoptimized IR.

I think it is pretty important not to phrase everything in terms of C or
C++ or Clang... =/

What if there are IR constructs that simply cannot be produced by C++?
We're adding lots of those for managed languages.

If there are practical concerns with having IR-based test cases, I'd like
to discuss those head-on rather than phrasing even more of our testing in
terms of a single language and frontend.

On Tue, Mar 1, 2016 at 2:16 AM Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Feb 29, 2016, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>
>
> ------------------------------
>
> From: "Mehdi Amini via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Alina Sbirlea" <alina.sbirlea at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, February 29, 2016 7:06:51 PM
> Subject: Re: [llvm-dev] RFC: Add bitcode tests to test-suite
>
>
>
> On Feb 29, 2016, at 3:39 PM, Alina Sbirlea <alina.sbirlea at gmail.com>
> wrote:
>
>
>
> On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <alina.sbirlea at gmail.com>
>> wrote:
>>
>>
>>
>> On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <mehdi.amini at apple.com>
>> wrote:
>>
>>>
>>> On Feb 29, 2016, at 11:40 AM, Mehdi Amini via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>
>>> On Feb 29, 2016, at 11:16 AM, Alina Sbirlea via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> All,
>>>
>>> To get the discussion going in a focused manner, here is an initial
>>> patch with a running test. The test is from the Halide suite and checks
>>> the correctness of several SIMD operations.
>>> (Note: the patch is large due to the number of operations being tested;
>>> I expect a lot of changes before actually landing it. This is simply to
>>> continue the discussion using a concrete example.)
>>> http://reviews.llvm.org/D17726
>>>
>>>
>>> I can't figure out how to download the patch *with the bitcode files*
>>> from Phabricator. Can you push this to github (or somewhere else)? (Or
>>> let me know if I missed how to proceed...)
>>>
>>>
>>> I was able to figure out how to get them "one by one"; it would still
>>> be more convenient to have an archive or a repo to clone somewhere.
>>>
>>> A few questions/todos to start the discussion:
>>> 1. What is a good location for these tests? They are in a separate
>>> Bitcode directory at the moment, but using llvm_multisource. This may
>>> change to more closely model the approach for external tests (see next
>>> item).
>>>
>>>
>>> A good location would be their own external repository IMO :)
>>>
>>> 2. There is a single .cpp file testing all operations provided by
>>> individual bitcode files. I expect this to change: instead, I'll use
>>> llvm_multisource to have the same test run with specific arguments,
>>> each run testing a single operation.
>>> 3. The building approach I took is to first link all bitcode files into
>>> a single one, then obtain the LLVM assembly (.ll) for it, which cmake
>>> knows to take as an input source.
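>>> (Roughly: llvm-link op_*.bc -o all_ops.bc, then llvm-dis all_ops.bc -o
>>> all_ops.ll, with the resulting .ll listed as a source file; the file
>>> names here are illustrative.)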
>>>
>>>
>>> Yeah, so I'd rather have a split-build model, with a split execution
>>> model. Having a gigantic bitcode file to debug an issue is not friendly.
>>> I'd expect a .cpp file that contains the main and the logic to run the
>>> tests, and then every test that is linked in gets executed, a bit like
>>> gtest does (there are registration mechanisms that avoid declaring each
>>> test explicitly in a header).
>>> -> filters.h and filter_headers.h should just go away.
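>>>
>>> Roughly what I have in mind, as a hypothetical sketch (none of these
>>> names exist in the suite yet):
>>>
>>>   // test_registry.h -- gtest-style self-registration, so adding a
>>>   // filter means adding one .cpp, not editing a central header.
>>>   #include <cstdio>
>>>   #include <functional>
>>>   #include <string>
>>>   #include <utility>
>>>   #include <vector>
>>>
>>>   inline std::vector<std::pair<std::string, std::function<bool()>>> &
>>>   allTests() {
>>>     static std::vector<std::pair<std::string, std::function<bool()>>> t;
>>>     return t;
>>>   }
>>>
>>>   struct RegisterTest {
>>>     RegisterTest(std::string name, std::function<bool()> fn) {
>>>       allTests().emplace_back(std::move(name), std::move(fn));
>>>     }
>>>   };
>>>
>>>   // In each filter's wrapper .cpp, one static object registers it:
>>>   //   static RegisterTest reg("op_pcmpeqq", run_op_pcmpeqq);
>>>
>>>   // main.cpp: run everything that got linked in.
>>>   int main() {
>>>     bool ok = true;
>>>     for (auto &t : allTests()) {
>>>       bool passed = t.second();
>>>       std::printf("%s: %s\n", t.first.c_str(), passed ? "PASS" : "FAIL");
>>>       ok = ok && passed;
>>>     }
>>>     return ok ? 0 : 1;
>>>   }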
>>>
>>
>> I agree, this is related to point 2. The plan here is to update the
>> current test .cpp file to test each operation individually. In this model
>> it will be enough to link with a single bitcode file per test.
>>
>>
>>>
>>> Also, on the tests in general: we should have an idea, for each test,
>>> of what it is doing and how.
>>> I was expecting your tests to follow the pattern of having an
>>> implementation in C++ and an implementation in Halide bitcode of a
>>> filter (or whatever), running both on random data, and verifying that
>>> the results match.
>>> Unfortunately, from what I can see, you are feeding the tests with
>>> random data, and the tests are "black boxes" that set an error flag if
>>> they detect an error.
>>> This is not super robust: the compiler can mess with the error checking
>>> and eliminate it, for instance, making any error go undetected.
>>>
>>
>> The Halide bitcode filters compare the result of vectorized operations
>> vs scalar runs of the same code. The error threshold against which we
>> compare the output will be set to a loose tolerance - it is currently 0.
>> We're interested in codegen bugs that return the wrong value entirely,
>> not accuracy differences (especially for floating point tests).
>> With the new error threshold, the data fed in may be random or read from
>> provided input files; I can do either.
>> The filters will still look somewhat like black boxes, though the name
>> of the filter says which operation is being tested, and the disassembled
>> bitcode files are reasonably readable.
>> Using your suggestion, the driver .cpp file will test one operation at a
>> time (argv set accordingly) and return right away once an error is
>> found. Sound about right?
>>
>>
>> All of this is great.
>> The part that is not clear to me is why we can't have a C/C++ reference
>> implementation of the filter (what does the current approach buy us over
>> that, or why is it better for us?), and hoist (and refactor) all the
>> logic to feed the tests and validate the output *out* of the filters. A
>> filter would be just the mathematical function performed and nothing
>> more (separation of concerns, more robust framework, easier debugging
>> when things go wrong, etc.).
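>>
>> To illustrate the split I mean, a hypothetical harness (all names and
>> the filter body are illustrative):
>>
>>   // harness.cpp -- feeding and checking live here, not in the filter.
>>   #include <cmath>
>>   #include <cstdio>
>>   #include <random>
>>   #include <vector>
>>
>>   // Provided by the Halide-generated bitcode (illustrative signature).
>>   extern "C" void filter_vectorized(const float *in, float *out, int n);
>>
>>   // Naive scalar C++ reference computing the golden result.
>>   static void filter_reference(const float *in, float *out, int n) {
>>     for (int i = 0; i < n; ++i)
>>       out[i] = in[i] * 2.0f; // whatever math the filter performs
>>   }
>>
>>   int main() {
>>     std::mt19937 rng(42); // fixed seed so failures reproduce
>>     std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
>>     const int n = 1024;
>>     std::vector<float> in(n), got(n), want(n);
>>     for (float &v : in) v = dist(rng);
>>     filter_vectorized(in.data(), got.data(), n);
>>     filter_reference(in.data(), want.data(), n);
>>     for (int i = 0; i < n; ++i)
>>       if (std::fabs(got[i] - want[i]) > 1e-5f) { // loose tolerance
>>         std::printf("mismatch at %d: %f vs %f\n", i, got[i], want[i]);
>>         return 1;
>>       }
>>     return 0;
>>   }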
>>
>
> I believe the answer is that Halide generates vectorized code in a way
> that is not generated by LLVM when starting from C/C++.
>
>
> I don't really see how *this* addresses my point. This is justifying why
> your bitcode is interesting and why we are having this conversation at
> all :)
> It does not say why we can't have a naive scalar C/C++ impl along with
> the bitcode for the filter.
>
>
> Having a C/C++ scalar reference would involve quite a bit of effort for
> all tests. The primary reason Halide is being used is that you don't need
> to write a lot of C/C++ code to get different optimizations for the same
> code (e.g. vectorized vs scalar is a one-line difference).
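>
> For example, the entire difference between a scalar and a vectorized
> variant of a trivial filter is one schedule line (a sketch from memory of
> the Halide API; the filter body is illustrative):
>
>   #include "Halide.h"
>   using namespace Halide;
>
>   int main() {
>     Func f("f");
>     Var x("x");
>     f(x) = x * 2;       // the algorithm: identical for every variant
>     f.vectorize(x, 8);  // the one-line schedule change; drop for scalar
>     f.compile_to_bitcode("f.bc", {}, "f");
>     return 0;
>   }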
>
>
> Yes, this is what is nice with Halide: "write once, codegen multiple
> variants". But it does not mean you can't write a C++ reference for every
> Halide filter (not for every codegen variant of a filter!)
>
> It's been 2 years since my last experiments with Halide, but my memory is
> that there was a C backend?
> I had in mind for each test to have (possibly in a separate directory for
> each test):
>  - the Halide source for the filter.
>  - the C/C++ reference (maybe generated?) for the filter.
>  - the bitcode generated for the filter (potentially multiple variants
> depending on the CPU support and/or the schedule).
>
> Then some common code/infrastructure to interact with the individual
> filters, loading them, executing the variants for a filter, and checking
> results.
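>
> Concretely, something like this layout (all names illustrative):
>
>   tests/op_pcmpeqq/
>     filter.cpp       # the Halide source for the filter
>     reference.cpp    # naive scalar C/C++ reference
>     filter_sse41.bc  # generated variants
>     filter_avx2.bc
>   common/
>     harness.cpp      # feeds data, runs variants, diffs vs the reference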
>
> If the reference C/C++ can't be generated by Halide (or obtained
> somehow), and we can't do better than the current test infrastructure you
> have, then I'm worried about the cost/benefit of this test-suite.
>
>
> I think that a C/C++ version would be nice to have, but not necessary.
>
> IR generated by non-Clang frontends, and/or IR going through alternate
> optimization pipelines, tends to hit bugs that are much harder to hit
> with Clang alone.
>
>
> To make it clear: the point of the C++ reference I was asking for is *not
> to stress clang* at all. It is intended to compute the *golden result* to
> be compared to the runs of each variant for a filter.  Having a reference
> is important when a diff is detected in the output and you need to figure
> out what is going on.
>
>
> It would help to have a description of what each test does, but including,
> for example, the Halide source code for each test will hopefully be enough
> of a guide.
>
>
> So we can get really fast test coverage for possible codegen bugs by
> checking that different layout optimizations in Halide give the same
> result.
> I think having each filter tested separately should give a good separation
> of concerns and easy debugging for each particular test.
>
>
> This is great for Halide validation; we are all agreeing on this, I
> think. The question is where the tradeoff lies for the LLVM project. I'm
> trying to make sure that the extra coverage doesn't come with a burden to
> debug and triage issues when something breaks: i.e. the tests need to be
> very friendly to interact with.
>
>
> I don't think that any non-trivial tests are truly "friendly" to interact
> with.
>
>
> Yeah, I just think we shouldn't make it arbitrarily worse by not having a
> good infrastructure to begin with :)
>
> In the end, if we expect this suite to be accepted and maintained as a
> "first-class citizen" by LLVM developers (i.e. accepting things like
> reverting a commit if it breaks something in this suite), we'd better make
> sure the burden to interact with it is minimal.
>
>
> Tracking down self-hosting bugs is not friendly, and those aren't
> anywhere near the worst ;)
>
> These tests with their simple driver seem like good input that bugpoint
> can reduce (assuming the test runtimes are not too long), and that's
> friendlier than most of the other multisource tests.
>
>
> If we're talking exclusively about crashes, I agree. However, if we're
> considering correctness issues as well (miscompiles), I believe that the
> structural changes I proposed are very important for running bugpoint on
> them effectively (otherwise bugpoint would just turn the test into
> "return false;").
> Also, I believe these changes are necessary to perform timing
> measurements for these tests, if we are interested in the quality of the
> optimization/codegen (to be hooked into something like
> https://github.com/google/benchmark ?).
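>
> For the timing part, each filter could be hooked in roughly like this
> (hypothetical sketch; the filter name and signature are illustrative):
>
>   #include <benchmark/benchmark.h>
>   #include <vector>
>
>   extern "C" void filter_vectorized(const float *in, float *out, int n);
>
>   static void BM_FilterVectorized(benchmark::State &state) {
>     const int n = static_cast<int>(state.range(0));
>     std::vector<float> in(n, 1.0f), out(n);
>     for (auto _ : state) {
>       filter_vectorized(in.data(), out.data(), n);
>       benchmark::DoNotOptimize(out.data()); // keep the call alive
>     }
>   }
>   BENCHMARK(BM_FilterVectorized)->Arg(1 << 10)->Arg(1 << 16);
>   BENCHMARK_MAIN();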
>
> Best,
>
> --
> Mehdi
>
>
>
>
>  -Hal
>
> This is the motivation for my comments so far.
> Other people in the community may have a different opinion/appreciation
> of the situation; this just represents my current thoughts.
>
> Hope it makes sense.
>
> --
> Mehdi
>
>
>
>
>
>>
>> Also, just looking quickly at one IR file, I'm surprised by things like:
>>>
>>> "assert succeeded165":                            ; preds = %"assert
>>> succeeded146"
>>>   %buf_host181 = getelementptr inbounds %struct.buffer_t,
>>> %struct.buffer_t* %error_op_pcmpeqq_272.buffer, i64 0, i32 1
>>>   %23 = bitcast i8** %buf_host181 to double**
>>>   %error_op_pcmpeqq_272.host226227232 = load double*, double** %23,
>>> align 8
>>>   %24 = icmp eq %struct.buffer_t* %error_op_pcmpeqq_272.buffer, null
>>>   br i1 %24, label %"assert failed183", label %"assert succeeded184",
>>> !prof !4
>>>
>>> Here you have a check for nullptr at %24, but you already
>>> loaded %error_op_pcmpeqq_272.host226227232 from this pointer just before!
>>>
>>
>> It's checking that the host value loaded from buffer_t is not null. I
>> don't see what's wrong with this. What am I missing?
>>
>>
>> I may be misreading it, my impression when skimming through the code was
>> that it seems equivalent to:
>>
>> void foo(buffer_t *out) {
>>   auto value = out->host; // load through 'out' first...
>>   if (!out) {             // ...and only then check it for null
>>     error("nullptr");
>>   }
>> }
>>
>>
>> In case I haven't been clear: I think this work is valuable for the
>> project, and thank you for putting some effort into it :)
>>
>> --
>> Mehdi
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>>
>>> A separate discussion is on reading metadata (mcpu and mattr) in llc. I
>>> added a script to work around that for now.
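>>> (For reference, the script ends up invoking roughly:
>>>   llc -mcpu=core-avx2 -mattr=+avx2 filter.ll -o filter.s
>>> with the cpu/attr values pulled out per test; the exact strings here are
>>> illustrative.)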
>>>
>>>
>>> The generic way of doing it in LLVM is (I think) to use function
>>> attributes:
>>>
>>> attributes #0 = { "target-cpu"="x86-64" "target-features"="+avx2" }
>>>
>>> You shouldn't need it on the command line I think?
>>>
>>
>> Yes, I believe so too. Currently these are set in mcpu and mattr by
>> Halide and not read in by llc, hence the need for feeding them as
>> parameters. It's a separate issue that we'll need to go into in depth, but
>> I don't want it to interfere with getting feedback on how to best publish
>> these tests.
>>
>>
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>
>>>
>>> Looking forward to your feedback!
>>>
>>> Thanks,
>>> Alina
>>>
>>>
>>>
>>> On Fri, Feb 19, 2016 at 6:50 AM, Kristof Beyls <kristof.beyls at arm.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On 18/02/2016 19:12, Alina Sbirlea via llvm-dev wrote:
>>>>
>>>>
>>>>
>>>>> I have more questions for Alina. What kind of tests do you have:
>>>>>
>>>>> - "the compiler takes the bitcode and generates code without crashing"
>>>>> - "the compiled test runs without crashing"
>>>>> - "the compiled test will produce an output that be checked against a
>>>>> reference"
>>>>> - "the compiled test is meaningful as a benchmarks"
>>>>>
>>>>
>>>> We have all 4 kinds of tests in Halide. The bitcode files for the
>>>> first category are already available, and I'm working on building the
>>>> ones for the next 3. We'd like to include all of them incrementally.
>>>>
>>>>
>>>> It seems to me that the tests in the first category ("the compiler
>>>> takes the bitcode and generates code without crashing") should be added
>>>> to the "make check-all" tests in the LLVM subproject, rather than the
>>>> test-suite subproject?
>>>> Or, if these tests currently don't crash the compiler anymore, the bugs
>>>> must have been fixed, and there should already be equivalent tests?
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>