r319413 - [CodeGen] Add initial support for union members in TBAA

Ivan A. Kosarev via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 30 01:26:39 PST 2017


Author: kosarev
Date: Thu Nov 30 01:26:39 2017
New Revision: 319413

URL: http://llvm.org/viewvc/llvm-project?rev=319413&view=rev
Log:
[CodeGen] Add initial support for union members in TBAA

The basic idea behind this patch is that since in strict aliasing
mode all accesses to union members require their outermost
enclosing union objects to be specified explicitly, then for a
couple given accesses to union members of the form

p->a.b.c...
q->x.y.z...

it is known they can only alias if both p and q point to the same
union type and offset ranges of members a.b.c... and x.y.z...
overlap. Note that the actual types of the members do not matter.

Specifically, in this patch we do the following:

* Make unions to be valid TBAA base access types. This enables
  generation of TBAA type descriptors for unions.

* Encode union types as structures with a single member of a
  special "union member" type. Currently we do not encode
  information about sizes of types, but conceptually such union
  members are considered to be of the size of the whole union.

* Encode accesses to direct and indirect union members, including
  member arrays, as accesses to these special members. All
  accesses to members of a union thus get the same offset, which
  is the offset of the union they are part of. This means the
  existing LLVM TBAA machinery is able to handle such accesses
  with no changes.

While this is already an improvement comparing to the current
situation, that is, representing all union accesses as may-alias
ones, there are further changes planned to complete the support
for unions. One of them is storing information about access sizes
so we can distinct accesses to non-overlapping union members,
including accesses to different elements of member arrays.
Another change is encoding type sizes in order to make it
possible to compute offsets within constant-indexed array
elements. These enhancements will be addressed with separate
patches.

Differential Revision: https://reviews.llvm.org/D39455

Added:
    cfe/trunk/test/CodeGen/tbaa-union.cpp
Modified:
    cfe/trunk/lib/CodeGen/CGExpr.cpp
    cfe/trunk/lib/CodeGen/CodeGenModule.h
    cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp
    cfe/trunk/lib/CodeGen/CodeGenTBAA.h
    cfe/trunk/test/CodeGen/union-tbaa1.c

Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=319413&r1=319412&r2=319413&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGExpr.cpp Thu Nov 30 01:26:39 2017
@@ -3723,9 +3723,6 @@ LValue CodeGenFunction::EmitLValueForFie
   if (base.getTBAAInfo().isMayAlias() ||
           rec->hasAttr<MayAliasAttr>() || FieldType->isVectorType()) {
     FieldTBAAInfo = TBAAAccessInfo::getMayAliasInfo();
-  } else if (rec->isUnion()) {
-    // TODO: Support TBAA for unions.
-    FieldTBAAInfo = TBAAAccessInfo::getMayAliasInfo();
   } else {
     // If no base type been assigned for the base access, then try to generate
     // one for this base lvalue.
@@ -3736,16 +3733,26 @@ LValue CodeGenFunction::EmitLValueForFie
                "Nonzero offset for an access with no base type!");
     }
 
-    // Adjust offset to be relative to the base type.
-    const ASTRecordLayout &Layout =
-        getContext().getASTRecordLayout(field->getParent());
-    unsigned CharWidth = getContext().getCharWidth();
-    if (FieldTBAAInfo.BaseType)
-      FieldTBAAInfo.Offset +=
-          Layout.getFieldOffset(field->getFieldIndex()) / CharWidth;
+    // All union members are encoded to be of the same special type.
+    if (FieldTBAAInfo.BaseType && rec->isUnion())
+      FieldTBAAInfo = TBAAAccessInfo::getUnionMemberInfo(FieldTBAAInfo.BaseType,
+                                                         FieldTBAAInfo.Offset,
+                                                         FieldTBAAInfo.Size);
+
+    // For now we describe accesses to direct and indirect union members as if
+    // they were at the offset of their outermost enclosing union.
+    if (!FieldTBAAInfo.isUnionMember()) {
+      // Adjust offset to be relative to the base type.
+      const ASTRecordLayout &Layout =
+          getContext().getASTRecordLayout(field->getParent());
+      unsigned CharWidth = getContext().getCharWidth();
+      if (FieldTBAAInfo.BaseType)
+        FieldTBAAInfo.Offset +=
+            Layout.getFieldOffset(field->getFieldIndex()) / CharWidth;
 
-    // Update the final access type.
-    FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType);
+      // Update the final access type.
+      FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType);
+    }
   }
 
   Address addr = base.getAddress();

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.h?rev=319413&r1=319412&r2=319413&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CodeGenModule.h (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.h Thu Nov 30 01:26:39 2017
@@ -688,8 +688,9 @@ public:
   /// getTBAAInfoForSubobject - Get TBAA information for an access with a given
   /// base lvalue.
   TBAAAccessInfo getTBAAInfoForSubobject(LValue Base, QualType AccessType) {
-    if (Base.getTBAAInfo().isMayAlias())
-      return TBAAAccessInfo::getMayAliasInfo();
+    TBAAAccessInfo TBAAInfo = Base.getTBAAInfo();
+    if (TBAAInfo.isMayAlias() || TBAAInfo.isUnionMember())
+      return TBAAInfo;
     return getTBAAAccessInfo(AccessType);
   }
 

Modified: cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp?rev=319413&r1=319412&r2=319413&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp Thu Nov 30 01:26:39 2017
@@ -74,6 +74,10 @@ llvm::MDNode *CodeGenTBAA::getChar() {
   return Char;
 }
 
+llvm::MDNode *CodeGenTBAA::getUnionMemberType(uint64_t Size) {
+  return createScalarTypeNode("union member", getChar(), Size);
+}
+
 static bool TypeHasMayAlias(QualType QTy) {
   // Tagged types have declarations, and therefore may have attributes.
   if (const TagType *TTy = dyn_cast<TagType>(QTy))
@@ -101,9 +105,8 @@ static bool isValidBaseType(QualType QTy
       return false;
     if (RD->hasFlexibleArrayMember())
       return false;
-    // RD can be struct, union, class, interface or enum.
-    // For now, we only handle struct and class.
-    if (RD->isStruct() || RD->isClass())
+    // For now, we do not allow interface classes to be base access types.
+    if (RD->isStruct() || RD->isClass() || RD->isUnion())
       return true;
   }
   return false;
@@ -277,18 +280,27 @@ llvm::MDNode *CodeGenTBAA::getBaseTypeIn
     const RecordDecl *RD = TTy->getDecl()->getDefinition();
     const ASTRecordLayout &Layout = Context.getASTRecordLayout(RD);
     SmallVector<llvm::MDBuilder::TBAAStructField, 4> Fields;
-    for (FieldDecl *Field : RD->fields()) {
-      QualType FieldQTy = Field->getType();
-      llvm::MDNode *TypeNode = isValidBaseType(FieldQTy) ?
-          getBaseTypeInfo(FieldQTy) : getTypeInfo(FieldQTy);
-      if (!TypeNode)
-        return BaseTypeMetadataCache[Ty] = nullptr;
-
-      uint64_t BitOffset = Layout.getFieldOffset(Field->getFieldIndex());
-      uint64_t Offset = Context.toCharUnitsFromBits(BitOffset).getQuantity();
-      uint64_t Size = Context.getTypeSizeInChars(FieldQTy).getQuantity();
-      Fields.push_back(llvm::MDBuilder::TBAAStructField(Offset, Size,
+    if (RD->isUnion()) {
+      // Unions are represented as structures with a single member that has a
+      // special type and occupies the whole object.
+      uint64_t Size = Context.getTypeSizeInChars(Ty).getQuantity();
+      llvm::MDNode *TypeNode = getUnionMemberType(Size);
+      Fields.push_back(llvm::MDBuilder::TBAAStructField(/* Offset= */ 0, Size,
                                                         TypeNode));
+    } else {
+      for (FieldDecl *Field : RD->fields()) {
+        QualType FieldQTy = Field->getType();
+        llvm::MDNode *TypeNode = isValidBaseType(FieldQTy) ?
+            getBaseTypeInfo(FieldQTy) : getTypeInfo(FieldQTy);
+        if (!TypeNode)
+          return nullptr;
+
+        uint64_t BitOffset = Layout.getFieldOffset(Field->getFieldIndex());
+        uint64_t Offset = Context.toCharUnitsFromBits(BitOffset).getQuantity();
+        uint64_t Size = Context.getTypeSizeInChars(FieldQTy).getQuantity();
+        Fields.push_back(llvm::MDBuilder::TBAAStructField(Offset, Size,
+                                                          TypeNode));
+      }
     }
 
     SmallString<256> OutName;
@@ -333,6 +345,8 @@ llvm::MDNode *CodeGenTBAA::getAccessTagI
 
   if (Info.isMayAlias())
     Info = TBAAAccessInfo(getChar(), Info.Size);
+  else if (Info.isUnionMember())
+    Info.AccessType = getUnionMemberType(Info.Size);
 
   if (!Info.AccessType)
     return nullptr;

Modified: cfe/trunk/lib/CodeGen/CodeGenTBAA.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenTBAA.h?rev=319413&r1=319412&r2=319413&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CodeGenTBAA.h (original)
+++ cfe/trunk/lib/CodeGen/CodeGenTBAA.h Thu Nov 30 01:26:39 2017
@@ -34,9 +34,10 @@ class CGRecordLayout;
 
 // TBAAAccessKind - A kind of TBAA memory access descriptor.
 enum class TBAAAccessKind : unsigned {
-  Ordinary,
-  MayAlias,
-  Incomplete,
+  Ordinary,     // An ordinary memory access.
+  MayAlias,     // An access that may alias with any other accesses.
+  Incomplete,   // Used to designate pointee values of incomplete types.
+  UnionMember,  // An access to a direct or indirect union member.
 };
 
 // TBAAAccessInfo - Describes a memory access in terms of TBAA.
@@ -77,6 +78,14 @@ struct TBAAAccessInfo {
 
   bool isIncomplete() const { return Kind == TBAAAccessKind::Incomplete; }
 
+  static TBAAAccessInfo getUnionMemberInfo(llvm::MDNode *BaseType,
+                                           uint64_t Offset, uint64_t Size) {
+    return TBAAAccessInfo(TBAAAccessKind::UnionMember, BaseType,
+                          /* AccessType= */ nullptr, Offset, Size);
+  }
+
+  bool isUnionMember() const { return Kind == TBAAAccessKind::UnionMember; }
+
   bool operator==(const TBAAAccessInfo &Other) const {
     return Kind == Other.Kind &&
            BaseType == Other.BaseType &&
@@ -148,6 +157,10 @@ class CodeGenTBAA {
   /// considered to be equivalent to it.
   llvm::MDNode *getChar();
 
+  /// getUnionMemberType - Get metadata that represents the type of union
+  /// members.
+  llvm::MDNode *getUnionMemberType(uint64_t Size);
+
   /// CollectFields - Collect information about the fields of a type for
   /// !tbaa.struct metadata formation. Return false for an unsupported type.
   bool CollectFields(uint64_t BaseOffset,

Added: cfe/trunk/test/CodeGen/tbaa-union.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/tbaa-union.cpp?rev=319413&view=auto
==============================================================================
--- cfe/trunk/test/CodeGen/tbaa-union.cpp (added)
+++ cfe/trunk/test/CodeGen/tbaa-union.cpp Thu Nov 30 01:26:39 2017
@@ -0,0 +1,100 @@
+// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -emit-llvm -o - | FileCheck %s
+//
+// Check that we generate correct TBAA information for accesses to union
+// members.
+
+struct X {
+  int a, b;
+  int arr[3];
+  int c, d;
+};
+
+union U {
+  int i;
+  X x;
+  int j;
+};
+
+struct S {
+  U u, v;
+};
+
+union N {
+  int i;
+  S s;
+  int j;
+};
+
+struct R {
+  N n, m;
+};
+
+int f1(U *p) {
+// CHECK-LABEL: _Z2f1P1U
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_U_j:!.*]]
+  return p->j;
+}
+
+int f2(S *p) {
+// CHECK-LABEL: _Z2f2P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_u_i:!.*]]
+  return p->u.i;
+}
+
+int f3(S *p) {
+// CHECK-LABEL: _Z2f3P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_v_j:!.*]]
+  return p->v.j;
+}
+
+int f4(S *p) {
+// CHECK-LABEL: _Z2f4P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_u_x_b:!.*]]
+  return p->u.x.b;
+}
+
+int f5(S *p) {
+// CHECK-LABEL: _Z2f5P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_v_x_b:!.*]]
+  return p->v.x.b;
+}
+
+int f6(S *p) {
+// CHECK-LABEL: _Z2f6P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_u_x_arr:!.*]]
+  return p->u.x.arr[1];
+}
+
+int f7(S *p) {
+// CHECK-LABEL: _Z2f7P1S
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_S_v_x_arr:!.*]]
+  return p->v.x.arr[1];
+}
+
+int f8(N *p) {
+// CHECK-LABEL: _Z2f8P1N
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_N_s_v_x_c:!.*]]
+  return p->s.v.x.c;
+}
+
+int f9(R *p) {
+// CHECK-LABEL: _Z2f9P1R
+// CHECK: load i32, i32* {{.*}}, !tbaa [[TAG_R_m_s_v_x_c:!.*]]
+  return p->m.s.v.x.c;
+}
+
+// CHECK-DAG: [[TAG_U_j]] = !{[[TYPE_U:!.*]], [[TYPE_union_member:!.*]], i64 0}
+// CHECK-DAG: [[TAG_S_u_i]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TAG_S_u_x_b]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TAG_S_u_x_arr]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TAG_S_v_j]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 28}
+// CHECK-DAG: [[TAG_S_v_x_b]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 28}
+// CHECK-DAG: [[TAG_S_v_x_arr]] = !{[[TYPE_S:!.*]], [[TYPE_union_member]], i64 28}
+// CHECK-DAG: [[TAG_N_s_v_x_c]] = !{[[TYPE_N:!.*]], [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TAG_R_m_s_v_x_c]] = !{[[TYPE_R:!.*]], [[TYPE_union_member]], i64 56}
+// CHECK-DAG: [[TYPE_U]] = !{!"_ZTS1U", [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TYPE_S]] = !{!"_ZTS1S", [[TYPE_U]], i64 0, [[TYPE_U]], i64 28}
+// CHECK-DAG: [[TYPE_N]] = !{!"_ZTS1N", [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TYPE_R]] = !{!"_ZTS1R", [[TYPE_N]], i64 0, [[TYPE_N]], i64 56}
+// CHECK-DAG: [[TYPE_union_member]] = !{!"union member", [[TYPE_char:!.*]], i64 0}
+// CHECK-DAG: [[TYPE_char]] = !{!"omnipotent char", {{.*}}, i64 0}

Modified: cfe/trunk/test/CodeGen/union-tbaa1.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/union-tbaa1.c?rev=319413&r1=319412&r2=319413&view=diff
==============================================================================
--- cfe/trunk/test/CodeGen/union-tbaa1.c (original)
+++ cfe/trunk/test/CodeGen/union-tbaa1.c Thu Nov 30 01:26:39 2017
@@ -15,30 +15,32 @@ void fred(unsigned Num, int Vec[2], int
 // But no tbaa for the two stores:
 // CHECK: %uw[[UW1:[0-9]*]] = getelementptr
 // CHECK: store{{.*}}%uw[[UW1]]
-// CHECK: tbaa ![[OCPATH:[0-9]+]]
+// CHECK: tbaa [[TAG_vect32_union_member:![0-9]+]]
 // There will be a load after the store, and it will use tbaa. Make sure
 // the check-not above doesn't find it:
 // CHECK: load
   Tmp[*Index][0].uw = Arr[*Index][0] * Num;
 // CHECK: %uw[[UW2:[0-9]*]] = getelementptr
 // CHECK: store{{.*}}%uw[[UW2]]
-// CHECK: tbaa ![[OCPATH]]
+// CHECK: tbaa [[TAG_vect32_union_member]]
   Tmp[*Index][1].uw = Arr[*Index][1] * Num;
 // Same here, don't generate tbaa for the loads:
 // CHECK: %uh[[UH1:[0-9]*]] = bitcast %union.vect32
 // CHECK: %arrayidx[[AX1:[0-9]*]] = getelementptr{{.*}}%uh[[UH1]]
 // CHECK: load i16, i16* %arrayidx[[AX1]]
-// CHECK: tbaa ![[OCPATH]]
+// CHECK: tbaa [[TAG_vect32_union_member]]
 // CHECK: store
   Vec[0] = Tmp[*Index][0].uh[1];
 // CHECK: %uh[[UH2:[0-9]*]] = bitcast %union.vect32
 // CHECK: %arrayidx[[AX2:[0-9]*]] = getelementptr{{.*}}%uh[[UH2]]
 // CHECK: load i16, i16* %arrayidx[[AX2]]
-// CHECK: tbaa ![[OCPATH]]
+// CHECK: tbaa [[TAG_vect32_union_member]]
 // CHECK: store
   Vec[1] = Tmp[*Index][1].uh[1];
   bar(Tmp);
 }
 
-// CHECK-DAG: ![[CHAR:[0-9]+]] = !{!"omnipotent char"
-// CHECK-DAG: ![[OCPATH]] = !{![[CHAR]], ![[CHAR]], i64 0}
+// CHECK-DAG: [[TAG_vect32_union_member]] = !{[[TYPE_vect32:!.*]], [[TYPE_union_member:!.*]], i64 0}
+// CHECK-DAG: [[TYPE_vect32]] = !{!"", [[TYPE_union_member]], i64 0}
+// CHECK-DAG: [[TYPE_union_member]] = !{!"union member", [[TYPE_char:!.*]], i64 0}
+// CHECK-DAG: [[TYPE_char]] = !{!"omnipotent char", {{.*}}}




More information about the cfe-commits mailing list