[LLVMdev] Question about LLVM NEON intrinsics

Fri Sep 21 09:52:00 PDT 2012

On Sep 21, 2012, at 2:54 AM, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB
> <sebastien.deldon at st.com> wrote:
>> Hi all,
>> 
>> I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units.
>> Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on the following .ll code:
>> 
>> 
>> ; ModuleID = 'vmax.ll'
>> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
>> target triple = "armv7-none-linux-androideabi"
>> 
>> define void @vmaxf32(<4 x float> *%C, <4 x float>* %A, <4 x float>* %B) nounwind {
>>    %tmp1 = load <4 x float>* %A
>>    %tmp2 = load <4 x float>* %B
>>    %tmp3 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %tmp1, <4 x float> %tmp2)
>>    store <4 x float> %tmp3, <4 x float>* %C
>>    ret void
>> }
>> 
>> declare <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float>, <4 x float>) nounwind readnone
>> 
>> I get the following generated code:
>> 
>> ...
>> vmaxf32:                                @ @vmaxf32
>> @ BB#0:
>>        vld1.64 {d16, d17}, [r2]
>>        vld1.64 {d18, d19}, [r1]
>>        vmax.f32        q8, q9, q8
>>        vst1.64 {d16, d17}, [r0]
>>        bx      lr
>> ...
>> 
>> Now if I use <16 x float> vectors instead of <4 x float>:
>> 
>> define void @vmaxf32(<16 x float> *%C, <16 x float>* %A, <16 x float>* %B) nounwind {
>>    %tmp1 = load <16 x float>* %A
>>    %tmp2 = load <16 x float>* %B
>>    %tmp3 = call <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float> %tmp1, <16 x float> %tmp2)
>>    store <16 x float> %tmp3, <16 x float>* %C
>>    ret void
>> }
>> 
>> declare <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float>, <16 x float>) nounwind readnone
>> 
>> llc fails with the following message:
>> 
>> SplitVectorResult #0: 0x2258350: v16f32 = llvm.arm.neon.vmaxs 0x2258250, 0x2258050, 0x2258150 [ORD=3] [ID=0]
>> 
>> LLVM ERROR: Do not know how to split the result of this operator!
>> 
>> Is this a bug? If so, I would be happy to get some direction on how to fix it.
> 
> No... platform-specific intrinsics have platform-specific semantics,
> including what types they're defined for. NEON doesn't have 16 x float
> vectors, at least not for that sort of operation.
> 
Right.

These backend intrinsics are designed to support the functions in arm_neon.h. Any use outside of that context is "here be dragons" territory.
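
For illustration, here is a minimal sketch of handling 16 floats through arm_neon.h in chunks of four, which is the widest float vector the header offers (float32x4_t). The function name vmax_f32 and the scalar fallback for non-NEON hosts are my own assumptions, added only so the sketch builds anywhere; the fallback does not try to reproduce NEON's NaN behaviour.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: element-wise max of two float arrays.
 * n must be a multiple of 4, so a "<16 x float>" operation becomes
 * four operations on the legal 4 x float type. */
void vmax_f32(float *c, const float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
#if HAVE_NEON
        /* One q-register load per operand, one vmax.f32, one store. */
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(c + i, vmaxq_f32(va, vb));
#else
        /* Scalar fallback for hosts without NEON (NaN handling may differ). */
        for (size_t j = 0; j < 4; ++j)
            c[i + j] = a[i + j] > b[i + j] ? a[i + j] : b[i + j];
#endif
    }
}
```

Calling vmax_f32(c, a, b, 16) covers the 16-element case from the question while only ever handing the backend a type the intrinsic is defined for.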

>> If not, I would like to know how to determine the valid types for a given LLVM intrinsic.
> 
> The ARM reference manual is probably your best bet for ARM intrinsics.
> 
> -Eli