[PATCH] D34161: [AArch64] Add ARMv8.2-A FP16 vector intrinsics - Continuation

Mon Jun 26 07:29:38 PDT 2017

SjoerdMeijer added a comment.

Unfortunately this is causing problems in testing:

  SplitVectorOperand Op #3: t8: ch = llvm.arm.neon.vst4lane<ST32[%0](align=2)> t3, TargetConstant:i32<699>, FrameIndex:i32<0>, undef:v4f16, undef:v4f16, undef:v4f16, undef:v4f16, Constant:i32<1>, Constant:i32<2>
  fatal error: error in backend: Do not know how to split this operator's operand!

This fatal error is raised in lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:1457.

Reproducer:

  typedef __attribute__((neon_vector_type(4))) short uint16x4_t;
  void dotests_1082() {
    __fp16 result[1];
    uint16x4_t __s1_0_3;
    uint16x4_t __s1_0_2;
    uint16x4_t __s1_0_1;
    uint16x4_t __s1_0_0;
    __builtin_neon_vst4_lane_v(result, __s1_0_0, __s1_0_1, __s1_0_2, __s1_0_3,
                               1, 8);
  }

Compile with:

  clang -O2 -target armv8-linux-gnueabihf -mcpu=cortex-a57 -D__ARM_NEON_FP16_INTRINSICS -S t.c

Before this patch the IR looked like this:

  call void @llvm.arm.neon.vst4lane.p0i8.v4i16(i8* nonnull %0, <4 x i16> undef, <4 x i16> undef, <4 x i16> undef, <4 x i16> undef, i32 1, i32 2)

And after this patch:

  call void @llvm.arm.neon.vst4lane.p0i8.v4f16(i8* nonnull %0, <4 x half> undef, <4 x half> undef, <4 x half> undef, <4 x half> undef, i32 1, i32 2)

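Note that the ##i16## vector types have changed into ##half##, and the intrinsic's type suffix from ##v4i16## to ##v4f16##.
The backend then fails to legalise type ##v4f16##: it tries to split the ##v4f16## operand of the ##vst4lane## node and does not know how.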
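
For reference, here is a minimal C sketch of what the lane store does, assuming the usual vst4_lane semantics: element `lane` of each of the four vectors is written to four consecutive 16-bit slots. The name `model_vst4_lane_16` is hypothetical, and this is a scalar model, not how LLVM lowers the intrinsic. It only illustrates that the store moves 16-bit patterns without interpreting them, so the ##i16## and ##half## forms should write the same bytes.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 4-vector lane store on 4 x 16-bit vectors.
   The elements are treated as opaque 16-bit patterns; whether they are
   i16 or half makes no difference to what ends up in memory. */
static void model_vst4_lane_16(uint16_t *dst,
                               const uint16_t a[4], const uint16_t b[4],
                               const uint16_t c[4], const uint16_t d[4],
                               int lane)
{
    /* One element from each vector, stored back to back. */
    dst[0] = a[lane];
    dst[1] = b[lane];
    dst[2] = c[lane];
    dst[3] = d[lane];
}
```

Note that the model writes four elements, so the destination needs room for four 16-bit values.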

Repository:
  rL LLVM

https://reviews.llvm.org/D34161