[clang] [Clang][Sema]: Allow flexible arrays in unions and alone in structs (PR #84428)

Kees Cook via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 21 09:42:14 PDT 2024


https://github.com/kees updated https://github.com/llvm/llvm-project/pull/84428

>From eb5138b45fa450737600050ad8dabdcb27513d42 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Thu, 7 Mar 2024 17:03:09 -0800
Subject: [PATCH 1/7] [Clang][Sema]: Allow flexible arrays in unions and alone
 in structs

GNU and MSVC have extensions where flexible array members (or their
equivalent) can be in unions or alone in structs. This is already fully
supported in Clang through the 0-sized array ("fake flexible array")
extension or when C99 flexible array members have been syntactically
obfuscated.

Clang needs to explicitly allow these extensions directly for C99
flexible arrays, since they are common code patterns in active use by the
Linux kernel (and other projects). Such projects have been using either
0-sized arrays (which is considered deprecated in favor of C99 flexible
array members) or via obfuscated syntax, both of which complicate their
code bases.

For example, these do not error by default:

union one {
	int a;
	int b[0];
};

union two {
	int a;
	struct {
		struct { } __empty;
		int b[];
	};
};

But this does:

union three {
	int a;
	int b[];
};

Remove the default error diagnostics for this but continue to provide
warnings under Microsoft or GNU extensions checks. This will allow for
a seamless transition for code bases away from 0-sized arrays without
losing existing code patterns. Add explicit checking for the warnings
under various constructions.

Fixes #84565
---
 clang/docs/ReleaseNotes.rst                   |  3 ++
 .../clang/Basic/DiagnosticSemaKinds.td        |  5 --
 clang/lib/Sema/SemaDecl.cpp                   |  8 +--
 clang/test/C/drs/dr5xx.c                      |  2 +-
 clang/test/Sema/flexible-array-in-union.c     | 53 +++++++++++++++++--
 clang/test/Sema/transparent-union.c           |  4 +-
 6 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1b901a27fd19d1..960ab7e021cf2f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -214,6 +214,9 @@ Improvements to Clang's diagnostics
 - Clang now diagnoses lambda function expressions being implicitly cast to boolean values, under ``-Wpointer-bool-conversion``.
   Fixes #GH82512.
 
+- ``-Wmicrosoft`` or ``-Wgnu`` is now required to diagnose C99 flexible
+  array members in a union or alone in a struct.
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index c8dfdc08f5ea07..f09121b8c7ec8f 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6447,9 +6447,6 @@ def ext_c99_flexible_array_member : Extension<
 def err_flexible_array_virtual_base : Error<
   "flexible array member %0 not allowed in "
   "%select{struct|interface|union|class|enum}1 which has a virtual base class">;
-def err_flexible_array_empty_aggregate : Error<
-  "flexible array member %0 not allowed in otherwise empty "
-  "%select{struct|interface|union|class|enum}1">;
 def err_flexible_array_has_nontrivial_dtor : Error<
   "flexible array member %0 of type %1 with non-trivial destruction">;
 def ext_flexible_array_in_struct : Extension<
@@ -6464,8 +6461,6 @@ def ext_flexible_array_empty_aggregate_ms : Extension<
   "flexible array member %0 in otherwise empty "
   "%select{struct|interface|union|class|enum}1 is a Microsoft extension">,
   InGroup<MicrosoftFlexibleArray>;
-def err_flexible_array_union : Error<
-  "flexible array member %0 in a union is not allowed">;
 def ext_flexible_array_union_ms : Extension<
   "flexible array member %0 in a union is a Microsoft extension">,
   InGroup<MicrosoftFlexibleArray>;
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 67e56a917a51de..053122b588246b 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -19357,15 +19357,11 @@ void Sema::ActOnFields(Scope *S, SourceLocation RecLoc, Decl *EnclosingDecl,
         } else if (Record->isUnion())
           DiagID = getLangOpts().MicrosoftExt
                        ? diag::ext_flexible_array_union_ms
-                       : getLangOpts().CPlusPlus
-                             ? diag::ext_flexible_array_union_gnu
-                             : diag::err_flexible_array_union;
+                       : diag::ext_flexible_array_union_gnu;
         else if (NumNamedMembers < 1)
           DiagID = getLangOpts().MicrosoftExt
                        ? diag::ext_flexible_array_empty_aggregate_ms
-                       : getLangOpts().CPlusPlus
-                             ? diag::ext_flexible_array_empty_aggregate_gnu
-                             : diag::err_flexible_array_empty_aggregate;
+                       : diag::ext_flexible_array_empty_aggregate_gnu;
 
         if (DiagID)
           Diag(FD->getLocation(), DiagID)
diff --git a/clang/test/C/drs/dr5xx.c b/clang/test/C/drs/dr5xx.c
index 68bcef78baccd7..13464f78b6a654 100644
--- a/clang/test/C/drs/dr5xx.c
+++ b/clang/test/C/drs/dr5xx.c
@@ -29,7 +29,7 @@ void dr502(void) {
    */
   struct t {
     int i;
-    struct { int a[]; }; /* expected-error {{flexible array member 'a' not allowed in otherwise empty struct}}
+    struct { int a[]; }; /* expected-warning {{flexible array member 'a' in otherwise empty struct is a GNU extension}}
                             c89only-warning {{flexible array members are a C99 feature}}
                             expected-warning {{'' may not be nested in a struct due to flexible array member}}
                           */
diff --git a/clang/test/Sema/flexible-array-in-union.c b/clang/test/Sema/flexible-array-in-union.c
index 5fabfbe0b1eaab..a4b2c5ff27184d 100644
--- a/clang/test/Sema/flexible-array-in-union.c
+++ b/clang/test/Sema/flexible-array-in-union.c
@@ -1,13 +1,58 @@
-// RUN: %clang_cc1 %s -verify=c -fsyntax-only
+// RUN: %clang_cc1 %s -verify -fsyntax-only
 // RUN: %clang_cc1 %s -verify -fsyntax-only -x c++
-// RUN: %clang_cc1 %s -verify -fsyntax-only -fms-compatibility
 // RUN: %clang_cc1 %s -verify -fsyntax-only -fms-compatibility -x c++
+// RUN: %clang_cc1 %s -verify=gnu -fsyntax-only -Wgnu-flexible-array-union-member -Wgnu-empty-struct
+// RUN: %clang_cc1 %s -verify=microsoft -fsyntax-only -fms-compatibility -Wmicrosoft
 
 // The test checks that an attempt to initialize union with flexible array
 // member with an initializer list doesn't crash clang.
 
 
-union { char x[]; } r = {0}; // c-error {{flexible array member 'x' in a union is not allowed}}
+union { char x[]; } r = {0}; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                                microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+                              */
 
-// expected-no-diagnostics
+struct already_hidden {
+  int a;
+  union {
+    int b;
+    struct {
+      struct { } __empty;  // gnu-warning {{empty struct is a GNU extension}}
+      char x[];
+    };
+  };
+};
+
+struct still_zero_sized {
+  struct { } __unused;  // gnu-warning {{empty struct is a GNU extension}}
+  int x[];
+};
+
+struct warn1 {
+  int a;
+  union {
+    int b;
+    char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+               */
+  };
+};
+
+struct warn2 {
+  int x[];  /* gnu-warning {{flexible array member 'x' in otherwise empty struct is a GNU extension}}
+               microsoft-warning {{flexible array member 'x' in otherwise empty struct is a Microsoft extension}}
+             */
+};
 
+union warn3 {
+  short x[];  /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+               */
+};
+
+struct quiet1 {
+  int a;
+  short x[];
+};
+
+// expected-no-diagnostics
diff --git a/clang/test/Sema/transparent-union.c b/clang/test/Sema/transparent-union.c
index c134a7a9b1c4d0..f02c2298b51ce1 100644
--- a/clang/test/Sema/transparent-union.c
+++ b/clang/test/Sema/transparent-union.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify -Wgnu-flexible-array-union-member %s
 typedef union {
   int *ip;
   float *fp;
@@ -131,7 +131,7 @@ union pr15134v2 {
 
 union pr30520v { void b; } __attribute__((transparent_union)); // expected-error {{field has incomplete type 'void'}}
 
-union pr30520a { int b[]; } __attribute__((transparent_union)); // expected-error {{flexible array member 'b' in a union is not allowed}}
+union pr30520a { int b[]; } __attribute__((transparent_union)); // expected-warning {{flexible array member 'b' in a union is a GNU extension}}
 
 // expected-note at +1 2 {{forward declaration of 'struct stb'}}
 union pr30520s { struct stb b; } __attribute__((transparent_union)); // expected-error {{field has incomplete type 'struct stb'}}

>From fc00111b6f4139c80d384a77fa6ee825ba33ff6c Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Mon, 18 Mar 2024 14:27:38 -0700
Subject: [PATCH 2/7] [Clang][Sema]: Add additional tests for flexible array
 initialization

Provide more positive and negative tests for flexible array
initialization under extensions (in unions, alone in structs), and the
associated warnings/errors across C, C++, and with/without extensions.
---
 clang/test/Sema/flexible-array-in-union.c | 110 +++++++++++++++++++++-
 1 file changed, 105 insertions(+), 5 deletions(-)

diff --git a/clang/test/Sema/flexible-array-in-union.c b/clang/test/Sema/flexible-array-in-union.c
index a4b2c5ff27184d..28d754e46e2fc1 100644
--- a/clang/test/Sema/flexible-array-in-union.c
+++ b/clang/test/Sema/flexible-array-in-union.c
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 %s -verify -fsyntax-only
-// RUN: %clang_cc1 %s -verify -fsyntax-only -x c++
-// RUN: %clang_cc1 %s -verify -fsyntax-only -fms-compatibility -x c++
-// RUN: %clang_cc1 %s -verify=gnu -fsyntax-only -Wgnu-flexible-array-union-member -Wgnu-empty-struct
-// RUN: %clang_cc1 %s -verify=microsoft -fsyntax-only -fms-compatibility -Wmicrosoft
+// RUN: %clang_cc1 %s -verify=stock,c -fsyntax-only
+// RUN: %clang_cc1 %s -verify=stock,cpp -fsyntax-only -x c++
+// RUN: %clang_cc1 %s -verify=stock,cpp -fsyntax-only -fms-compatibility -x c++
+// RUN: %clang_cc1 %s -verify=stock,c,gnu -fsyntax-only -Wgnu-flexible-array-union-member -Wgnu-empty-struct
+// RUN: %clang_cc1 %s -verify=stock,c,microsoft -fsyntax-only -fms-compatibility -Wmicrosoft
 
 // The test checks that an attempt to initialize union with flexible array
 // member with an initializer list doesn't crash clang.
@@ -11,6 +11,106 @@
 union { char x[]; } r = {0}; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
                                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
                               */
+struct _name1 {
+  int a;
+  union {
+    int b;
+    char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+               */
+  };
+} name1 = {
+  10,
+  42,        /* initializes "b" */
+};
+
+struct _name1i {
+  int a;
+  union {
+    int b;
+    char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+               */
+  };
+} name1i = {
+  .a = 10,
+  .b = 42,
+};
+
+/* Initialization of flexible array in a union is never allowed. */
+struct _name2 {
+  int a;
+  union {
+    int b;
+    char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+                 stock-note {{initialized flexible array member 'x' is here}}
+               */
+  };
+} name2 = {
+  12,
+  13,
+  { 'c' },   /* stock-error {{initialization of flexible array member is not allowed}} */
+};
+
+/* Initialization of flexible array in a union is never allowed. */
+struct _name2i {
+  int a;
+  union {
+    int b;
+    char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                 microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+                 stock-note {{initialized flexible array member 'x' is here}}
+               */
+  };
+} name2i = {
+  .a = 12,
+  .b = 13,      /* stock-note {{previous initialization is here}} */
+  .x = { 'c' }, /* stock-error {{initialization of flexible array member is not allowed}}
+                   c-warning {{initializer overrides prior initialization of this subobject}}
+                   cpp-error {{initializer partially overrides prior initialization of this subobject}}
+                 */
+};
+
+/* Flexible array initialization always allowed when not in a union,
+   and when struct has another member.
+ */
+struct _okay {
+  int a;
+  char x[];
+} okay = {
+  22,
+  { 'x', 'y', 'z' },
+};
+
+struct _okayi {
+  int a;
+  char x[];
+} okayi = {
+  .a = 22,
+  .x = { 'x', 'y', 'z' },
+};
+
+struct _okay0 {
+  int a;
+  char x[];
+} okay0 = { };
+
+struct _flex_extension {
+  char x[]; /* gnu-warning {{flexible array member 'x' in otherwise empty struct is a GNU extension}}
+               microsoft-warning {{flexible array member 'x' in otherwise empty struct is a Microsoft extension}}
+             */
+} flex_extension = {
+  { 'x', 'y', 'z' },
+};
+
+struct _flex_extensioni {
+  char x[]; /* gnu-warning {{flexible array member 'x' in otherwise empty struct is a GNU extension}}
+               microsoft-warning {{flexible array member 'x' in otherwise empty struct is a Microsoft extension}}
+             */
+} flex_extensioni = {
+  .x = { 'x', 'y', 'z' },
+};
 
 struct already_hidden {
   int a;

>From 89c775387b0ce2b427e7b3f99b9c5df9062d3f6c Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Mon, 18 Mar 2024 22:26:02 -0700
Subject: [PATCH 3/7] [Clang][Sema]: Add code-gen tests for flexible arrays in
 unions

Most things appear to be working as expected, but there is one case that
does not behave as expected: using an unnamed initializer when a flexible
array is not the first member in a union. It appears to get applied
(but it must be a zero initializer, "{}"), instead of being skipped. The
old style of evading the syntax checker does the right thing, so this
is a distinct behavioral difference between the bypass and the attempt
to use flexible arrays directly in a union.

This is marked with TODO: in the test case.
---
 clang/test/CodeGen/flexible-array-init.c  | 68 ++++++++++++++++++++++-
 clang/test/Sema/flexible-array-in-union.c | 29 ++++++++++
 2 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/clang/test/CodeGen/flexible-array-init.c b/clang/test/CodeGen/flexible-array-init.c
index bae926da5feb07..c0021a8dd6f095 100644
--- a/clang/test/CodeGen/flexible-array-init.c
+++ b/clang/test/CodeGen/flexible-array-init.c
@@ -3,9 +3,15 @@
 struct { int x; int y[]; } a = { 1, 7, 11 };
 // CHECK: @a ={{.*}} global { i32, [2 x i32] } { i32 1, [2 x i32] [i32 7, i32 11] }
 
+struct { int y[]; } a1 = { 8, 12 };
+// CHECK: @a1 ={{.*}} global { [2 x i32] } { [2 x i32] [i32 8, i32 12] }
+
 struct { int x; int y[]; } b = { 1, { 13, 15 } };
 // CHECK: @b ={{.*}} global { i32, [2 x i32] } { i32 1, [2 x i32] [i32 13, i32 15] }
 
+struct { int y[]; } b1 = { { 14, 16 } };
+// CHECK: @b1 ={{.*}} global { [2 x i32] } { [2 x i32] [i32 14, i32 16] }
+
 // sizeof(c) == 8, so this global should be at least 8 bytes.
 struct { int x; char c; char y[]; } c = { 1, 2, { 13, 15 } };
 // CHECK: @c ={{.*}} global { i32, i8, [2 x i8] } { i32 1, i8 2, [2 x i8] c"\0D\0F" }
@@ -21,10 +27,70 @@ struct __attribute((packed, aligned(4))) { char a; int x; char z[]; } e = { 1, 2
 struct { int x; char y[]; } f = { 1, { 13, 15 } };
 // CHECK: @f ={{.*}} global <{ i32, [2 x i8] }> <{ i32 1, [2 x i8] c"\0D\0F" }>
 
+struct __attribute((packed)) { short a; char z[]; } g = { 2, { 11, 13, 15 } };
+// CHECK: @g ={{.*}} <{ i16, [3 x i8] }> <{ i16 2, [3 x i8] c"\0B\0D\0F" }>,
+
+// Last member is the potential flexible array, unnamed initializer skips it.
+struct { int a; union { int b; short x; }; int c; int d; } h = {1, 2, {}, 3};
+// CHECK: @h = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+struct { int a; union { int b; short x[0]; }; int c; int d; } h0 = {1, 2, {}, 3};
+// CHECK: @h0 = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+struct { int a; union { int b; short x[1]; }; int c; int d; } h1 = {1, 2, {}, 3};
+// CHECK: @h1 = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+struct {
+  int a;
+  union {
+    int b;
+    struct {
+      struct { } __ununsed;
+      short x[];
+    };
+  };
+  int c;
+  int d;
+} hiding = {1, 2, {}, 3};
+// CHECK: @hiding = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+struct { int a; union { int b; short x[]; }; int c; int d; } hf = {1, 2, {}, 3};
+// TODO: @hf = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+
+// First member is the potential flexible array, initialization requires braces.
+struct { int a; union { short x; int b; }; int c; int d; } i = {1, 2, {}, 3};
+// CHECK: @i = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] undef }, i32 0, i32 3 }
+struct { int a; union { short x[0]; int b; }; int c; int d; } i0 = {1, {}, 2, 3};
+// CHECK: @i0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 2, i32 3 }
+struct { int a; union { short x[1]; int b; }; int c; int d; } i1 = {1, {2}, {}, 3};
+// CHECK: @i1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] undef }, i32 0, i32 3 }
+struct { int a; union { short x[]; int b; }; int c; int d; } i_f = {4, {}, {}, 6};
+// CHECK: @i_f = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 4, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 0, i32 6 }
+
+// Named initializers; order doesn't matter.
+struct { int a; union { int b; short x; }; int c; int d; } hn = {.a = 1, .x = 2, .c = 3};
+// CHECK: @hn = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] undef }, i32 3, i32 0 }
+struct { int a; union { int b; short x[0]; }; int c; int d; } hn0 = {.a = 1, .x = {2}, .c = 3};
+// CHECK: @hn0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 3, i32 0 }
+struct { int a; union { int b; short x[1]; }; int c; int d; } hn1 = {.a = 1, .x = {2}, .c = 3};
+// CHECK: @hn1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] undef }, i32 3, i32 0 }
+
 union {
   struct {
     int a;
     char b[];
   } x;
 } in_union = {};
-// CHECK: @in_union ={{.*}} global %union.anon zeroinitializer
+// CHECK: @in_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 4
+
+union {
+  int a;
+  char b[];
+} only_union = {};
+// CHECK: @only_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 4
+
+union {
+  char a[];
+} empty_union = {};
+// CHECK: @empty_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 1
+
+struct {
+  char a[];
+} empty_struct = {};
+// CHECK: @empty_struct ={{.*}} global %struct.anon{{.*}} zeroinitializer, align 1
diff --git a/clang/test/Sema/flexible-array-in-union.c b/clang/test/Sema/flexible-array-in-union.c
index 28d754e46e2fc1..f39052f3fb51f3 100644
--- a/clang/test/Sema/flexible-array-in-union.c
+++ b/clang/test/Sema/flexible-array-in-union.c
@@ -155,4 +155,33 @@ struct quiet1 {
   short x[];
 };
 
+struct _not_at_end {
+  union { short x[]; }; /* stock-warning-re {{field '' with variable sized type '{{.*}}' not at the end of a struct or class is a GNU extension}}
+                           gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                           microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+                         */
+  int y;
+} not_at_end = {{}, 3};
+
+struct _not_at_end_s {
+  struct { int a; short x[]; }; /* stock-warning-re {{field '' with variable sized type '{{.*}}' not at the end of a struct or class is a GNU extension}} */
+  int y;
+} not_at_end_s = {{}, 3};
+
+struct {
+  int a;
+  union {      /* stock-warning-re {{field '' with variable sized type '{{.*}}' not at the end of a struct or class is a GNU extension}} */
+    short x[]; /* stock-note {{initialized flexible array member 'x' is here}}
+                  gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
+                  microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
+                */
+    int b;
+  };
+  int c;
+  int d;
+} i_f = { 4,
+         {5},  /* stock-error {{initialization of flexible array member is not allowed}} */
+         {},
+          6};
+
 // expected-no-diagnostics

>From 138ccee2734740915b603036ca9a7edd04b77250 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Tue, 19 Mar 2024 21:24:04 -0700
Subject: [PATCH 4/7] [Clang][Sema]: Do not consume flex array initializers in
 unions

Only process flexible arrays in unions if they are the first (or only)
member.

Add codegen tests for "= {0}" cases.
---
 clang/lib/Sema/SemaInit.cpp               |  1 +
 clang/test/CodeGen/flexible-array-init.c  | 48 +++++++++++++----------
 clang/test/Sema/flexible-array-in-union.c |  5 ++-
 3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 011deed7a9a9f1..79cf2eed46fe3d 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -2437,6 +2437,7 @@ void InitListChecker::CheckStructUnionTypes(
   }
 
   if (Field == FieldEnd || !Field->getType()->isIncompleteArrayType() ||
+      (RD->isUnion() && Field != RD->field_begin()) ||
       Index >= IList->getNumInits())
     return;
 
diff --git a/clang/test/CodeGen/flexible-array-init.c b/clang/test/CodeGen/flexible-array-init.c
index c0021a8dd6f095..c93d7b9ca51a8a 100644
--- a/clang/test/CodeGen/flexible-array-init.c
+++ b/clang/test/CodeGen/flexible-array-init.c
@@ -51,7 +51,7 @@ struct {
 } hiding = {1, 2, {}, 3};
 // CHECK: @hiding = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
 struct { int a; union { int b; short x[]; }; int c; int d; } hf = {1, 2, {}, 3};
-// TODO: @hf = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
+// CHECK: @hf = global %struct.anon{{.*}} { i32 1, %union.anon{{.*}} { i32 2 }, i32 0, i32 3 }
 
 // First member is the potential flexible array, initialization requires braces.
 struct { int a; union { short x; int b; }; int c; int d; } i = {1, 2, {}, 3};
@@ -71,26 +71,32 @@ struct { int a; union { int b; short x[0]; }; int c; int d; } hn0 = {.a = 1, .x
 struct { int a; union { int b; short x[1]; }; int c; int d; } hn1 = {.a = 1, .x = {2}, .c = 3};
 // CHECK: @hn1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] undef }, i32 3, i32 0 }
 
-union {
-  struct {
-    int a;
-    char b[];
-  } x;
-} in_union = {};
-// CHECK: @in_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 4
+struct { char a[]; } empty_struct = {};
+// CHECK: @empty_struct ={{.*}} global %struct.anon{{.*}} zeroinitializer, align 1
 
-union {
-  int a;
-  char b[];
-} only_union = {};
-// CHECK: @only_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 4
+struct { char a[]; } empty_struct0 = {0};
+// CHECK: @empty_struct0 = global { [1 x i8] } zeroinitializer, align 1
 
-union {
-  char a[];
-} empty_union = {};
-// CHECK: @empty_union ={{.*}} global %union.anon{{.*}} zeroinitializer, align 1
+union { struct { int a; char b[]; }; } struct_in_union = {};
+// CHECK: @struct_in_union = global %union.anon{{.*}} zeroinitializer, align 4
 
-struct {
-  char a[];
-} empty_struct = {};
-// CHECK: @empty_struct ={{.*}} global %struct.anon{{.*}} zeroinitializer, align 1
+union { struct { int a; char b[]; }; } struct_in_union0 = {0};
+// CHECK: @struct_in_union0 = global %union.anon{{.*}} zeroinitializer, align 4
+
+union { int a; char b[]; } trailing_in_union = {};
+// CHECK: @trailing_in_union = global %union.anon{{.*}} zeroinitializer, align 4
+
+union { int a; char b[]; } trailing_in_union0 = {0};
+// CHECK: @trailing_in_union0 = global %union.anon{{.*}} zeroinitializer, align 4
+
+union { char a[]; } only_in_union = {};
+// CHECK: @only_in_union = global %union.anon{{.*}} zeroinitializer, align 1
+
+union { char a[]; } only_in_union0 = {0};
+// CHECK: @only_in_union0 = global %union.anon{{.*}} undef, align 1
+
+union { char a[]; int b; } first_in_union = {};
+// CHECK: @first_in_union = global { [0 x i8], [4 x i8] } { [0 x i8] zeroinitializer, [4 x i8] undef }, align 4
+
+union { char a[]; int b; } first_in_union0 = {};
+// CHECK: @first_in_union0 = global { [0 x i8], [4 x i8] } { [0 x i8] zeroinitializer, [4 x i8] undef }, align 4
diff --git a/clang/test/Sema/flexible-array-in-union.c b/clang/test/Sema/flexible-array-in-union.c
index f39052f3fb51f3..dd5e8069665fea 100644
--- a/clang/test/Sema/flexible-array-in-union.c
+++ b/clang/test/Sema/flexible-array-in-union.c
@@ -44,13 +44,14 @@ struct _name2 {
     int b;
     char x[]; /* gnu-warning {{flexible array member 'x' in a union is a GNU extension}}
                  microsoft-warning {{flexible array member 'x' in a union is a Microsoft extension}}
-                 stock-note {{initialized flexible array member 'x' is here}}
                */
   };
 } name2 = {
   12,
   13,
-  { 'c' },   /* stock-error {{initialization of flexible array member is not allowed}} */
+  { 'c' },   /* c-warning {{excess elements in struct initializer}}
+                cpp-error {{excess elements in struct initializer}}
+              */
 };
 
 /* Initialization of flexible array in a union is never allowed. */

>From 06bc935771399672d0140340b226b9c2b04e13fd Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Tue, 19 Mar 2024 22:58:01 -0700
Subject: [PATCH 5/7] [Clang][Sema] Add flex array init codegen test for C++

Verify that C++ flexible array initialization matches C's (and is
unchanged by the proposed PR).
---
 clang/test/CodeGen/flexible-array-init.cpp | 25 ++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 clang/test/CodeGen/flexible-array-init.cpp

diff --git a/clang/test/CodeGen/flexible-array-init.cpp b/clang/test/CodeGen/flexible-array-init.cpp
new file mode 100644
index 00000000000000..d3f0a15cad486c
--- /dev/null
+++ b/clang/test/CodeGen/flexible-array-init.cpp
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple i386-unknown-unknown -x c++ -emit-llvm -o - %s | FileCheck %s
+
+union _u { char a[]; } u = {};
+union _u0 { char a[]; } u0 = {0};
+
+// CHECK: %union._u = type { [0 x i8] }
+// CHECK: %union._u0 = type { [0 x i8] }
+
+// CHECK: @u = global %union._u zeroinitializer, align 1
+// CHECK: @u0 = global %union._u0 undef, align 1
+
+union { char a[]; } z = {};
+// CHECK: @z = internal global %union.{{.*}} zeroinitializer, align 1
+union { char a[]; } z0 = {0};
+// CHECK: @z0 = internal global %union.{{.*}} undef, align 1
+
+/* C++ requires global anonymous unions have static storage, so we have to
+   reference them to keep them in the IR output. */
+char keep(int pick)
+{
+	if (pick)
+		return z.a[0];
+	else
+		return z0.a[0];
+}

>From ecb15f16d342907efa33fbcd3432be4db0f9d8b1 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Wed, 20 Mar 2024 22:56:22 -0700
Subject: [PATCH 6/7] [Clang][Sema] Actually initialize flexible arrays in
 unions

When handling list initializers, SemaInit was bailing out when it found
a flexible array. Instead, it should continue to associate the field
with the initializer. Simplify the checking for Unions. Update the tests
now that they are correctly creating a zero initializer instead of
"undef".
---
 clang/lib/Sema/SemaInit.cpp                | 10 +++++-----
 clang/test/CodeGen/flexible-array-init.c   |  2 +-
 clang/test/CodeGen/flexible-array-init.cpp |  5 ++---
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 79cf2eed46fe3d..32e5675aa8e1ae 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -2330,12 +2330,13 @@ void InitListChecker::CheckStructUnionTypes(
       break;
     }
 
-    // We've already initialized a member of a union. We're done.
+    // We've already initialized a member of a union. We can stop entirely.
     if (InitializedSomething && RD->isUnion())
-      break;
+      return;
 
-    // If we've hit the flexible array member at the end, we're done.
-    if (Field->getType()->isIncompleteArrayType())
+    // If we've hit a flexible array member, only allow zero initialization.
+    // Other values are not yet supported. See commit 5955a0f9375a.
+    if (Field->getType()->isIncompleteArrayType() && !IsZeroInitializer(Init))
       break;
 
     if (Field->isUnnamedBitfield()) {
@@ -2437,7 +2438,6 @@ void InitListChecker::CheckStructUnionTypes(
   }
 
   if (Field == FieldEnd || !Field->getType()->isIncompleteArrayType() ||
-      (RD->isUnion() && Field != RD->field_begin()) ||
       Index >= IList->getNumInits())
     return;
 
diff --git a/clang/test/CodeGen/flexible-array-init.c b/clang/test/CodeGen/flexible-array-init.c
index c93d7b9ca51a8a..9e5cfbce7bc6f0 100644
--- a/clang/test/CodeGen/flexible-array-init.c
+++ b/clang/test/CodeGen/flexible-array-init.c
@@ -93,7 +93,7 @@ union { char a[]; } only_in_union = {};
 // CHECK: @only_in_union = global %union.anon{{.*}} zeroinitializer, align 1
 
 union { char a[]; } only_in_union0 = {0};
-// CHECK: @only_in_union0 = global %union.anon{{.*}} undef, align 1
+// CHECK: @only_in_union0 = global { [1 x i8] } zeroinitializer, align 1
 
 union { char a[]; int b; } first_in_union = {};
 // CHECK: @first_in_union = global { [0 x i8], [4 x i8] } { [0 x i8] zeroinitializer, [4 x i8] undef }, align 4
diff --git a/clang/test/CodeGen/flexible-array-init.cpp b/clang/test/CodeGen/flexible-array-init.cpp
index d3f0a15cad486c..d067a614e1afe5 100644
--- a/clang/test/CodeGen/flexible-array-init.cpp
+++ b/clang/test/CodeGen/flexible-array-init.cpp
@@ -4,15 +4,14 @@ union _u { char a[]; } u = {};
 union _u0 { char a[]; } u0 = {0};
 
 // CHECK: %union._u = type { [0 x i8] }
-// CHECK: %union._u0 = type { [0 x i8] }
 
 // CHECK: @u = global %union._u zeroinitializer, align 1
-// CHECK: @u0 = global %union._u0 undef, align 1
+// CHECK: @u0 = global { [1 x i8] } zeroinitializer, align 1
 
 union { char a[]; } z = {};
 // CHECK: @z = internal global %union.{{.*}} zeroinitializer, align 1
 union { char a[]; } z0 = {0};
-// CHECK: @z0 = internal global %union.{{.*}} undef, align 1
+// CHECK: @z0 = internal global { [1 x i8] } zeroinitializer, align 1
 
 /* C++ requires global anonymous unions have static storage, so we have to
    reference them to keep them in the IR output. */

>From 3a784ff64972c957940fc298f423493ec55984c5 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook at chromium.org>
Date: Thu, 21 Mar 2024 09:40:10 -0700
Subject: [PATCH 7/7] [Clang][SemaInit] Flex array member: Correctly use
 StructuredList->setInitializedFieldInUnion()

I found the right place for StructuredList->setInitializedFieldInUnion.
Remove my misunderstanding about zero initializers.
---
 clang/lib/Sema/SemaInit.cpp | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 32e5675aa8e1ae..b910588e0017ec 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -2334,9 +2334,8 @@ void InitListChecker::CheckStructUnionTypes(
     if (InitializedSomething && RD->isUnion())
       return;
 
-    // If we've hit a flexible array member, only allow zero initialization.
-    // Other values are not yet supported. See commit 5955a0f9375a.
-    if (Field->getType()->isIncompleteArrayType() && !IsZeroInitializer(Init))
+    // Stop if we've hit a flexible array member.
+    if (Field->getType()->isIncompleteArrayType())
       break;
 
     if (Field->isUnnamedBitfield()) {
@@ -2458,6 +2457,11 @@ void InitListChecker::CheckStructUnionTypes(
   else
     CheckImplicitInitList(MemberEntity, IList, Field->getType(), Index,
                           StructuredList, StructuredIndex);
+
+  if (RD->isUnion() && StructuredList) {
+    // Initialize the first field within the union.
+    StructuredList->setInitializedFieldInUnion(*Field);
+  }
 }
 
 /// Expand a field designator that refers to a member of an



More information about the cfe-commits mailing list