[PATCH] D34161: [AArch64] Add ARMv8.2-A FP16 vector intrinsics - Continuation

Thu Jun 15 12:10:39 PDT 2017

az added a comment.

In https://reviews.llvm.org/D34161#781106, @SjoerdMeijer wrote:

> Not sure if I am missing something, but the new regression tests ##arm-v8.2a-neon-intrinsics.c## and ##aarch64-neon-intrinsics.c## are failing for me (after first applying patch https://reviews.llvm.org/D32511 and then this one here in https://reviews.llvm.org/D34161).  For example, the very first test in ##arm-v8.2a-neon-intrinsics.c##  checks that the function argument is ##<4 x half> %a##, but ##<4 x half> %1## is generated. With anything other than ##-O0## you will get the ##%a## (and then it becomes a tailcall).

I renamed ##arm-v8.2a-neon-intrinsics.c## to ##aarch64-v8.2a-neon-intrinsics.c## and modified the +fp16 flag. Could it be that you are still running the old file from the old patch with the old flag? If that is not the case, then I will add this:

For the newly added file ##aarch64-v8.2a-neon-intrinsics.c##, It passed for me in this patch and in the original one (https://reviews.llvm.org/D32511). However, it failed for other people (in the same way as you are describing: %1 vs. %a) in the original patch and the changes were reverted. That lead me to execute this test individually in https://reviews.llvm.org/D32511. It gave me a warning about the +fp16 flag and I replaced it with the correct +fullfp16 flag after consulting with you about it. I was hoping that fixing the flag and the warning will make the test pass for all. Note that the conversion of ##<4 x half> %1##, to ##<4 x half> %a# is done by mem2reg pass (opt -S mem2reg). I am not sure why this optimization does not work for you on this test. I can either 1) replace the occurrence of variables such as %a by * in the test because we are not really testing the optimization after all or 2) send me your cmake command so that I can build and test in the same way to be able to reproduce and investigate the problem on my side.

For the old test ##arm-neon-intrinsics.c##, I did not run it in https://reviews.llvm.org/D32511 because it has REQUIRES: long-tests and our testing configuration did not run those. Once I run it, I can reproduce the problem and I fixed it in this patch.  Does it still fail for you?

https://reviews.llvm.org/D34161