[PATCH] D34161: [AArch64] Add ARMv8.2-A FP16 vector intrinsics - Continuation

Fri Jun 16 07:44:48 PDT 2017

SjoerdMeijer added a comment.

Just to avoid any confusion/mistakes, can you upload the final patch that you intend to commit? 
My understanding is that it will include: https://reviews.llvm.org/D32511 + https://reviews.llvm.org/D34161 - ##arm-v8.2a-neon-intrinsics.c##.
And I was indeed accidentally running the old test, but its replacement, ##aarch64-neon-intrinsics.c##, which
I was also running, is giving me similar problems. This looks like a very easy fix to me: if you add -O2,
you don't need to run it separately through opt and mem2reg (I get exactly the same result with and without them anyway),
and the only thing you need to change is, e.g., this expected string:

[[ABS:%.*]] =  call <4 x half> @llvm.fabs.v4f16(<4 x half> %a)

to match this:

%vabs1.i = tail call <4 x half> @llvm.fabs.v4f16(<4 x half> %a)

where the only difference is "tail".
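
To illustrate why the old expected string stops matching, here is a rough sketch in Python. It is only an approximation of how FileCheck matches a pattern (it handles [[NAME:regex]] captures and flexible horizontal whitespace, nothing else), and the helper name check_to_regex is made up for this example. The updated pattern is my reading of the change suggested above, not text taken from the patch.

```python
import re


def check_to_regex(pattern):
    """Approximate a single FileCheck pattern as a Python regex.

    Hypothetical helper: supports [[NAME:regex]] captures and treats
    runs of spaces/tabs as flexible whitespace. Not real FileCheck.
    """
    def literal(text):
        # Escape literal pieces and let any run of blanks match any run of blanks.
        return r"[ \t]+".join(re.escape(piece) for piece in re.split(r"[ \t]+", text))

    parts = []
    pos = 0
    for m in re.finditer(r"\[\[(\w+):(.*?)\]\]", pattern):
        parts.append(literal(pattern[pos:m.start()]))
        parts.append("(?P<%s>%s)" % (m.group(1), m.group(2)))
        pos = m.end()
    parts.append(literal(pattern[pos:]))
    return re.compile("".join(parts))


# Expected string currently in the test.
OLD_CHECK = "[[ABS:%.*]] =  call <4 x half> @llvm.fabs.v4f16(<4 x half> %a)"
# Assumed updated expected string: only "tail" is added.
NEW_CHECK = "[[ABS:%.*]] = tail call <4 x half> @llvm.fabs.v4f16(<4 x half> %a)"
# IR line produced with -O2, as quoted above.
O2_LINE = "%vabs1.i = tail call <4 x half> @llvm.fabs.v4f16(<4 x half> %a)"

old_match = check_to_regex(OLD_CHECK).search(O2_LINE)
new_match = check_to_regex(NEW_CHECK).search(O2_LINE)

if __name__ == "__main__":
    print("old pattern matches:", old_match is not None)
    print("new pattern matches:", new_match is not None)
```

The old pattern fails because it requires "call" to follow the "=" directly, while the -O2 output has "tail" in between.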

So I agree that this is a straightforward bugfix and also the correct thing to do (i.e. using the +fullfp16 option).

https://reviews.llvm.org/D34161