r216249 - [test/CodeGen/ARM] Adapt test to match new codegen after r216236.

Mon Aug 25 16:43:59 PDT 2014

Thanks Bob for the context.

On Aug 25, 2014, at 4:05 PM, Bob Wilson <bob.wilson at apple.com> wrote:

> + Jim Grosbach
> 
>> On Aug 25, 2014, at 10:45 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>> 
>> Thanks Renato.
>> 
>> I’ll wait for Bob’s input, but I think, like you, that removing the optimization level is the right thing to do.
>> 
>> -Quentin
> 
> This is kind of an unusual test that doesn’t fit the usual mold. I’m hesitant to defend it, except to say that so far, it is better than the alternatives.
> 
> Jim and I have debated this at some length, and Jim agrees with both of you that the “right” thing to do is to have a front-end test that checks the IR output of the front-end, with separate tests in the backend to check that the IR is translated to the expected assembly.
> 
> We don’t have that.
> 
> We have backend tests for some Neon code-gen and perhaps a few front-end tests for intrinsics as well, but there are a huge number of Neon intrinsics and it is important to have exhaustive tests for all of them. That’s what this test is for. It is machine-generated from clang’s arm_neon.td file. Obviously, if someone introduces a bug in the .td file, we would not catch it with this test. To avoid that, we have checked-in this pre-generated test, which we have manually checked against ARM’s documentation of the Neon intrinsics. I had requested an automatic test to regenerate it and check for differences, but I’m not sure that has gotten implemented yet.
> 
> So, anyway, that is the context. The reason it is compiled with optimization is that the auto-generated checks need to look for certain opcodes and without optimization, you won’t get many of them.
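
The regenerate-and-compare check mentioned above could look roughly like the sketch below. It is only an illustration in Python: the function name, the sample CHECK lines, and the choice to pass both texts in as strings are assumptions, not an existing LLVM or clang tool. A real version would read the checked-in test and the generator's fresh output from arm_neon.td.

```python
import difflib


def diff_against_checked_in(checked_in, regenerated):
    """Return unified-diff lines; an empty list means the two texts match."""
    return list(
        difflib.unified_diff(
            checked_in.splitlines(),
            regenerated.splitlines(),
            fromfile="checked-in",
            tofile="regenerated",
            lineterm="",
        )
    )


# Hypothetical contents, used only to show the shape of the result.
checked_in = "// CHECK: vadd.i8\n// CHECK: vmov\n"
regenerated = "// CHECK: vadd.i8\n"

drift = diff_against_checked_in(checked_in, regenerated)
for line in drift:
    print(line)
```

Any non-empty result would flag that the .td file and the manually verified test have drifted apart, which is the case the checked-in copy is meant to catch.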

What do you recommend for the test cases where we are able to optimize the sequence of instructions? (See my examples from a previous mail)

Since we have to run with the optimizations enabled, is it worth reworking the test cases to use the values produced by the copy-related intrinsics?
That way the moves wouldn’t be coalesced.

Thanks,
-Quentin 

> 
>> On Aug 25, 2014, at 10:31 AM, Renato Golin <renato.golin at linaro.org> wrote:
>> 
>>> On 25 August 2014 17:57, Quentin Colombet <qcolombet at apple.com> wrote:
>>>> The problem is that the intrinsics at stake are just fancy moves, that can
>>>> be coalesced. I do not think there is a way to prevent the optimization
>>>> from happening other than disabling the optimization.
>>> 
>>> Well, those intrinsics are useful for a few things, normally a
>>> sequence of neon calls to which the move is not clear and needs to be
>>> explicit. Although this case was obvious and "easy" to generate, it's
>>> normally the first optimizations you end up doing, so not stable
>>> enough.
>>> 
>>> 
>>>> I can add the flag to do that, but I guess that wouldn’t be the right fix,
>>>> since we could have another backend than LLVM.
>>>> If we do want to check the lowering of intrinsics, shouldn’t we drop Os from
>>>> the run command?
>>> 
>>> Yes! We want to test if the front-end is generating the correct
>>> intrinsic, so any level of optimization will hide whatever we were
>>> trying to test in the first place. I'd venture to say that whoever removed
>>> the other tests in this file also did wrong, and we need to fix this.
>>> 
>>> I'm copying Bob since he was the one writing the NEON intrinsics in
>>> the first place, maybe he can see things I fail to. But AFAICS,
>>> disabling *any* optimization and re-checking for the correct
>>> instructions is the way to go.
>>> 
>>> cheers,
>>> --renato
>> 
>