r304201 - [ARM] Fix Neon vector type alignment to 64-bit

Tue May 30 09:40:50 PDT 2017

On Tue, May 30, 2017 at 3:12 AM, Javed Absar via cfe-commits
<cfe-commits at lists.llvm.org> wrote:
> Author: javed.absar
> Date: Tue May 30 05:12:15 2017
> New Revision: 304201
>
> URL: http://llvm.org/viewvc/llvm-project?rev=304201&view=rev
> Log:
> [ARM] Fix Neon vector type alignment to 64-bit
>
> The maximum alignment for ARM NEON data types should be 64-bits as specified
> in ARM procedure call standard document Sec. A.2 Notes.
> This patch fixes it from its current larger natural default values, except
> for Android (so as not to break existing ABI).
> Reviewed by: Stephen Hines, Renato Golin.
> Differential Revision: https://reviews.llvm.org/D33205
>
>
> Modified:
>     cfe/trunk/lib/Basic/Targets.cpp
>     cfe/trunk/test/CodeGen/arm-abi-vector.c
>     cfe/trunk/test/CodeGen/arm-neon-misc.c
>     cfe/trunk/test/CodeGen/arm-swiftcall.c
>     cfe/trunk/test/CodeGen/armv7k-abi.c
>
> Modified: cfe/trunk/lib/Basic/Targets.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/Targets.cpp?rev=304201&r1=304200&r2=304201&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Basic/Targets.cpp (original)
> +++ cfe/trunk/lib/Basic/Targets.cpp Tue May 30 05:12:15 2017
> @@ -5382,6 +5382,11 @@ public:
>      // ARM has atomics up to 8 bytes
>      setAtomic();
>
> +    if (Triple.getEnvironment() == llvm::Triple::Android)
> +      MaxVectorAlign = 128; // don't break existing Android ABI
> +    else
> +      MaxVectorAlign = 64; // AAPCS
> +

This is an ABI break on darwin as well.  There are 40 lines of ifs
*just a few lines above* that pick various ABIs for various targets,
because there are multiple AAPCS variants, and the ARM document
(sadly) isn't gospel.

Is there a way to restrict this to vanilla AAPCS?  Maybe sink it into
setABIAAPCS?

>      // Do force alignment of members that follow zero length bitfields.  If
>      // the alignment of the zero-length bitfield is greater than the member
>      // that follows it, `bar', `bar' will be aligned as the  type of the
>
> Modified: cfe/trunk/test/CodeGen/arm-abi-vector.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/arm-abi-vector.c?rev=304201&r1=304200&r2=304201&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGen/arm-abi-vector.c (original)
> +++ cfe/trunk/test/CodeGen/arm-abi-vector.c Tue May 30 05:12:15 2017
> @@ -133,20 +133,20 @@ double test_5c(__char5 *in) {
>
>  double varargs_vec_9c(int fixed, ...) {
>  // CHECK: varargs_vec_9c
> -// CHECK: [[VAR:%.*]] = alloca <9 x i8>, align 16
> +// CHECK: [[VAR:%.*]] = alloca <9 x i8>, align 8
>  // CHECK: [[ALIGN:%.*]] = and i32 {{%.*}}, -8
>  // CHECK: [[AP_ALIGN:%.*]] = inttoptr i32 [[ALIGN]] to i8*
>  // CHECK: [[AP_NEXT:%.*]] = getelementptr inbounds i8, i8* [[AP_ALIGN]], i32 16
>  // CHECK: [[AP_CAST:%.*]] = bitcast i8* [[AP_ALIGN]] to <9 x i8>*
>  // CHECK: [[T0:%.*]] = load <9 x i8>, <9 x i8>* [[AP_CAST]], align 8
> -// CHECK: store <9 x i8> [[T0]], <9 x i8>* [[VAR]], align 16
> +// CHECK: store <9 x i8> [[T0]], <9 x i8>* [[VAR]], align 8
>  // APCS-GNU: varargs_vec_9c
> -// APCS-GNU: [[VAR:%.*]] = alloca <9 x i8>, align 16
> +// APCS-GNU: [[VAR:%.*]] = alloca <9 x i8>, align 8
>  // APCS-GNU: [[AP:%.*]] = load i8*,
>  // APCS-GNU: [[AP_NEXT:%.*]] = getelementptr inbounds i8, i8* [[AP]], i32 16
>  // APCS-GNU: [[AP_CAST:%.*]] = bitcast i8* [[AP]] to <9 x i8>*
>  // APCS-GNU: [[VEC:%.*]] = load <9 x i8>, <9 x i8>* [[AP_CAST]], align 4
> -// APCS-GNU: store <9 x i8> [[VEC]], <9 x i8>* [[VAR]], align 16
> +// APCS-GNU: store <9 x i8> [[VEC]], <9 x i8>* [[VAR]], align 8
>  // ANDROID: varargs_vec_9c
>  // ANDROID: [[VAR:%.*]] = alloca <9 x i8>, align 16
>  // ANDROID: [[ALIGN:%.*]] = and i32 {{%.*}}, -8
> @@ -246,15 +246,15 @@ double test_3s(__short3 *in) {
>
>  double varargs_vec_5s(int fixed, ...) {
>  // CHECK: varargs_vec_5s
> -// CHECK: [[VAR_ALIGN:%.*]] = alloca <5 x i16>, align 16
> +// CHECK: [[VAR_ALIGN:%.*]] = alloca <5 x i16>, align 8
>  // CHECK: [[ALIGN:%.*]] = and i32 {{%.*}}, -8
>  // CHECK: [[AP_ALIGN:%.*]] = inttoptr i32 [[ALIGN]] to i8*
>  // CHECK: [[AP_NEXT:%.*]] = getelementptr inbounds i8, i8* [[AP_ALIGN]], i32 16
>  // CHECK: [[AP_CAST:%.*]] = bitcast i8* [[AP_ALIGN]] to <5 x i16>*
>  // CHECK: [[VEC:%.*]] = load <5 x i16>, <5 x i16>* [[AP_CAST]], align 8
> -// CHECK: store <5 x i16> [[VEC]], <5 x i16>* [[VAR_ALIGN]], align 16
> +// CHECK: store <5 x i16> [[VEC]], <5 x i16>* [[VAR_ALIGN]], align 8
>  // APCS-GNU: varargs_vec_5s
> -// APCS-GNU: [[VAR:%.*]] = alloca <5 x i16>, align 16
> +// APCS-GNU: [[VAR:%.*]] = alloca <5 x i16>, align 8
>  // APCS-GNU: [[AP:%.*]] = load i8*,
>  // APCS-GNU: [[AP_NEXT:%.*]] = getelementptr inbounds i8, i8* [[AP]], i32 16
>  // APCS-GNU: [[AP_CAST:%.*]] = bitcast i8* [[AP]] to <5 x i16>*
>
> Modified: cfe/trunk/test/CodeGen/arm-neon-misc.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/arm-neon-misc.c?rev=304201&r1=304200&r2=304201&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGen/arm-neon-misc.c (original)
> +++ cfe/trunk/test/CodeGen/arm-neon-misc.c Tue May 30 05:12:15 2017
> @@ -32,3 +32,11 @@ void t2(uint64_t *src1, uint8_t *src2, u
>      *dst = q;
>  // CHECK: store <2 x i64>
>  }
> +
> +// Neon types have 64-bit alignment
> +int32x4_t gl_b;
> +void t3(int32x4_t *src) {
> +// CHECK: @t3
> +  gl_b = *src;
> +// CHECK: store <4 x i32> {{%.*}}, <4 x i32>* @gl_b, align 8
> +}
>
> Modified: cfe/trunk/test/CodeGen/arm-swiftcall.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/arm-swiftcall.c?rev=304201&r1=304200&r2=304201&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGen/arm-swiftcall.c (original)
> +++ cfe/trunk/test/CodeGen/arm-swiftcall.c Tue May 30 05:12:15 2017
> @@ -343,7 +343,7 @@ typedef union {
>  } union_hom_fp_partial;
>  TEST(union_hom_fp_partial)
>  // CHECK-LABEL: define void @test_union_hom_fp_partial()
> -// CHECK:   [[TMP:%.*]] = alloca [[REC:%.*]], align 16
> +// CHECK:   [[TMP:%.*]] = alloca [[REC:%.*]], align 8
>  // CHECK:   [[CALL:%.*]] = call [[SWIFTCC]] [[UAGG:{ float, float, float, float }]] @return_union_hom_fp_partial()
>  // CHECK:   [[CAST_TMP:%.*]] = bitcast [[REC]]* [[TMP]] to [[AGG:{ float, float, float, float }]]*
>  // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG]], [[AGG]]* [[CAST_TMP]], i32 0, i32 0
> @@ -376,7 +376,7 @@ typedef union {
>  } union_het_fpv_partial;
>  TEST(union_het_fpv_partial)
>  // CHECK-LABEL: define void @test_union_het_fpv_partial()
> -// CHECK:   [[TMP:%.*]] = alloca [[REC:%.*]], align 16
> +// CHECK:   [[TMP:%.*]] = alloca [[REC:%.*]], align 8
>  // CHECK:   [[CALL:%.*]] = call [[SWIFTCC]] [[UAGG:{ i32, i32, float, float }]] @return_union_het_fpv_partial()
>  // CHECK:   [[CAST_TMP:%.*]] = bitcast [[REC]]* [[TMP]] to [[AGG:{ i32, i32, float, float }]]*
>  // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG]], [[AGG]]* [[CAST_TMP]], i32 0, i32 0
> @@ -413,7 +413,7 @@ TEST(int4)
>
>  TEST(int8)
>  // CHECK-LABEL: define {{.*}} @return_int8()
> -// CHECK:   [[RET:%.*]] = alloca [[REC:<8 x i32>]], align 32
> +// CHECK:   [[RET:%.*]] = alloca [[REC:<8 x i32>]], align 8
>  // CHECK:   [[VAR:%.*]] = alloca [[REC]], align
>  // CHECK:   store
>  // CHECK:   load
> @@ -457,7 +457,7 @@ TEST(int8)
>
>  TEST(int5)
>  // CHECK-LABEL: define {{.*}} @return_int5()
> -// CHECK:   [[RET:%.*]] = alloca [[REC:<5 x i32>]], align 32
> +// CHECK:   [[RET:%.*]] = alloca [[REC:<5 x i32>]], align 8
>  // CHECK:   [[VAR:%.*]] = alloca [[REC]], align
>  // CHECK:   store
>  // CHECK:   load
>
> Modified: cfe/trunk/test/CodeGen/armv7k-abi.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/armv7k-abi.c?rev=304201&r1=304200&r2=304201&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGen/armv7k-abi.c (original)
> +++ cfe/trunk/test/CodeGen/armv7k-abi.c Tue May 30 05:12:15 2017
> @@ -83,11 +83,11 @@ typedef struct {
>  OddlySizedStruct return_oddly_sized_struct() {}
>
>  // CHECK: define <4 x float> @test_va_arg_vec(i8* %l)
> -// CHECK:   [[ALIGN_TMP:%.*]] = add i32 {{%.*}}, 15
> -// CHECK:   [[ALIGNED:%.*]] = and i32 [[ALIGN_TMP]], -16
> +// CHECK:   [[ALIGN_TMP:%.*]] = add i32 {{%.*}}, 7
> +// CHECK:   [[ALIGNED:%.*]] = and i32 [[ALIGN_TMP]], -8

Allocas are not a huge deal, but I think this particular change is incompatible.

-Ahmed

>  // CHECK:   [[ALIGNED_I8:%.*]] = inttoptr i32 [[ALIGNED]] to i8*
>  // CHECK:   [[ALIGNED_VEC:%.*]] = bitcast i8* [[ALIGNED_I8]] to <4 x float>
> -// CHECK:   load <4 x float>, <4 x float>* [[ALIGNED_VEC]], align 16
> +// CHECK:   load <4 x float>, <4 x float>* [[ALIGNED_VEC]], align 8
>  float32x4_t test_va_arg_vec(__builtin_va_list l) {
>    return __builtin_va_arg(l, float32x4_t);
>  }
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits