[LLVMdev] Question about LLVM NEON intrinsics

Fri Sep 21 01:28:54 PDT 2012

Hi all,

I would like to know whether the LLVM NEON intrinsics are designed to support only types that are 'legal' for the NEON unit.
Running llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on the following IR:

; ModuleID = 'vmax.ll'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple = "armv7-none-linux-androideabi"

define void @vmaxf32(<4 x float> *%C, <4 x float>* %A, <4 x float>* %B) nounwind {
    %tmp1 = load <4 x float>* %A
    %tmp2 = load <4 x float>* %B
    %tmp3 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %tmp1, <4 x float> %tmp2)    
    store <4 x float> %tmp3, <4 x float>* %C
    ret void
}

declare <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float>, <4 x float>) nounwind readnone

I get the following generated code:

...
vmaxf32:                                @ @vmaxf32
@ BB#0:
	vld1.64	{d16, d17}, [r2]
	vld1.64	{d18, d19}, [r1]
	vmax.f32	q8, q9, q8
	vst1.64	{d16, d17}, [r0]
	bx	lr
...

Now, if I use <16 x float> vectors instead of <4 x float>:

define void @vmaxf32(<16 x float> *%C, <16 x float>* %A, <16 x float>* %B) nounwind {
    %tmp1 = load <16 x float>* %A
    %tmp2 = load <16 x float>* %B
    %tmp3 = call <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float> %tmp1, <16 x float> %tmp2)    
    store <16 x float> %tmp3, <16 x float>* %C
    ret void
}

declare <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float>, <16 x float>) nounwind readnone

llc fails with the following message:

SplitVectorResult #0: 0x2258350: v16f32 = llvm.arm.neon.vmaxs 0x2258250, 0x2258050, 0x2258150 [ORD=3] [ID=0]

LLVM ERROR: Do not know how to split the result of this operator!

Is this a bug? If so, I would be happy to get some pointers on how to fix it. If not, I would like to know how to determine the valid types for a given LLVM intrinsic.
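
For reference, a possible workaround would be to split the wide vectors by hand and call the v4f32 variant once per quarter. This is an untested sketch that assumes the intrinsic is only defined for the 64-bit and 128-bit vector types that map onto D and Q registers; the function name @vmaxf32_split and the value names are made up for illustration:

```llvm
define void @vmaxf32_split(<16 x float>* %C, <16 x float>* %A, <16 x float>* %B) nounwind {
    %a = load <16 x float>* %A
    %b = load <16 x float>* %B

    ; Extract the four <4 x float> quarters of each operand.
    %a0 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
    %a1 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
    %a2 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 8, i32 9, i32 10, i32 11>
    %a3 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>
    %b0 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
    %b1 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
    %b2 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 8, i32 9, i32 10, i32 11>
    %b3 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>

    ; Apply the intrinsic on the 128-bit type that already works above.
    %m0 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a0, <4 x float> %b0)
    %m1 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a1, <4 x float> %b1)
    %m2 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a2, <4 x float> %b2)
    %m3 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a3, <4 x float> %b3)

    ; Reassemble the four partial results into one <16 x float>.
    %lo = shufflevector <4 x float> %m0, <4 x float> %m1, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
    %hi = shufflevector <4 x float> %m2, <4 x float> %m3, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
    %r = shufflevector <8 x float> %lo, <8 x float> %hi, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>

    store <16 x float> %r, <16 x float>* %C
    ret void
}

declare <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float>, <4 x float>) nounwind readnone
```

This only avoids the problem from the IR side; it does not tell me whether llc is expected to do the same split automatically for the intrinsic.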

Thanks for your answers
Best Regards
Seb