[PATCH] Fixing a struct layout bug with bitfields before virtual base

Yunzhong Gao Yunzhong_Gao at playstation.sony.com
Thu Jan 16 12:17:58 PST 2014


ygao added you to the CC list for the revision "Fixing a struct layout bug with bitfields before virtual base".

Hi,
The following patch fixes PR18430 by checking that the bitfield storage, plus any padding,
does not grow into the following virtual base. Without this fix, clang asserts on the following
test file:

```
/* test.cpp
 * clang++ -S -o - test.cpp
 */
struct A { virtual void vf2(); };

struct B { virtual void vf1(); };

struct C : A, virtual B {
  char fld0;
  long long fld1 : 48;
  virtual void vf1();
};

void C::vf1() { }
/* end of test.cpp */
```
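
For reference, the base, field, and virtual base offsets discussed below can be checked against
clang's own AST record-layout dump (the exact output format varies a bit between clang versions):

```
clang++ -Xclang -fdump-record-layouts -c test.cpp -o /dev/null
```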

The code in CGRecordLayoutBuilder.cpp tries to grow fld1 from 48 bits to 64 bits. fld1 starts
8 bits from the beginning of the struct and the virtual base starts at 64 bits, so if fld1 is
widened to 64 bits, it overlaps the storage reserved for the virtual base. The proposed patch
checks that the bitfield storage does not grow into the virtual base.
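
In other words, the idea is to cap the end offset available to the widening code at the record's
non-virtual size when compiling C++. A minimal sketch of that idea (the type and function names
here are illustrative only, not the actual CGRecordLayoutBuilder code):

```
#include <cstdint>

struct RecordSizesInBits {   // hypothetical helper for this sketch
  uint64_t DataSize;         // dsize of the record, in bits
  uint64_t NonVirtualSize;   // nvsize of the record, in bits
};

// Return the offset past which a run of bitfields must not be widened.
uint64_t bitfieldWidenLimit(const RecordSizesInBits &R, bool IsCPlusPlus) {
  // C has no virtual bases, so the data size remains a safe bound.
  if (!IsCPlusPlus)
    return R.DataSize;
  // In C++, stop at the non-virtual size so the widened storage stays inside
  // the non-virtual part of the record and does not grow into the storage
  // reserved for a following virtual base (the situation in PR18430).
  return R.NonVirtualSize;
}
```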

I would also like to bring up a related question about some code in the same function. It seems to
me that growing the bitfield storage is intended as an optimization rather than being required for
correctness; is that right? If so, growing fld1 from 48 bits to 64 bits seems a little odd, because
loading those 64 bits would be an unaligned access on many targets, and my impression is that
unaligned accesses are slow. Maybe it makes more sense to grow fld1 from 48 bits to 56 bits only,
so that fld1 ends at a 64-bit boundary. For example, with the current implementation, clang
generates a 64-bit load plus some masking at -O2 on my x86-64 Ubuntu box to extract the second
bitfield of the struct below, although a 32-bit load alone should suffice.

```
struct __attribute__((aligned(16))) s7 {
  char f0;
  long long f1:24;
  long long f2:32;
};

int foo(struct s7 *a0) { return a0->f2; }
```

I have a patch in mind that would do something like this:

```
Index: lib/CodeGen/CGRecordLayoutBuilder.cpp
===================================================================
--- lib/CodeGen/CGRecordLayoutBuilder.cpp	
+++ lib/CodeGen/CGRecordLayoutBuilder.cpp
@@ -353,8 +353,9 @@
          "End offset is not past the end of the known storage bits.");
   uint64_t SpaceBits = EndOffset - FirstFieldOffset;
   uint64_t LongBits = Types.getTarget().getLongWidth();
-  uint64_t WidenedBits = (StorageBits / LongBits) * LongBits +
-                         llvm::NextPowerOf2(StorageBits % LongBits - 1);
+  uint64_t WidenedBits = ((FirstFieldOffset + StorageBits) / LongBits) * LongBits +
+           llvm::NextPowerOf2((FirstFieldOffset + StorageBits) % LongBits - 1) -
+           FirstFieldOffset;
   assert(WidenedBits >= StorageBits && "Widening shrunk the bits!");
   if (WidenedBits <= SpaceBits) {
     StorageBits = WidenedBits;
```
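
To make the arithmetic concrete, here is a small standalone program that evaluates both formulas
from the diff above for the s7 example, assuming the bitfield run f1/f2 starts 8 bits into the
record (FirstFieldOffset = 8), spans 24 + 32 = 56 bits (StorageBits = 56), and that long is
64 bits; NextPowerOf2 below just mirrors the behavior of llvm::NextPowerOf2:

```
#include <cstdint>
#include <cstdio>

// Mirrors llvm::NextPowerOf2: smallest power of two strictly greater than A
// (wraps to 0 when A has its top bit set).
static uint64_t NextPowerOf2(uint64_t A) {
  A |= (A >> 1);  A |= (A >> 2);  A |= (A >> 4);
  A |= (A >> 8);  A |= (A >> 16); A |= (A >> 32);
  return A + 1;
}

int main() {
  // Assumed values for the s7 example above.
  const uint64_t FirstFieldOffset = 8, StorageBits = 56, LongBits = 64;

  // Current formula: widens the run's size in isolation (whole longs plus a
  // power-of-two tail), ignoring where the run starts in the record.
  uint64_t Current = (StorageBits / LongBits) * LongBits +
                     NextPowerOf2(StorageBits % LongBits - 1);

  // Proposed formula: applies the same rounding to the run's end offset
  // within the record, then converts back to a size relative to the run's
  // start.
  uint64_t Proposed =
      ((FirstFieldOffset + StorageBits) / LongBits) * LongBits +
      NextPowerOf2((FirstFieldOffset + StorageBits) % LongBits - 1) -
      FirstFieldOffset;

  // Prints: current = 64 (a 64-bit access starting at bit 8, i.e. misaligned),
  // proposed = 56 (the storage still ends on the 64-bit boundary).
  std::printf("current = %llu, proposed = %llu\n",
              (unsigned long long)Current, (unsigned long long)Proposed);
  return 0;
}
```

Note that when the run (or, in the proposed version, its end offset in the record) already falls on
a LongBits boundary, the modulo-minus-one term wraps around and NextPowerOf2 overflows to zero, so
nothing extra is added; that is what produces 56 rather than 64 in this example.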

This would change the backend code generation for some struct layouts, so I would like to have
some discussion about whether it is beneficial.

Could someone take a look?

- Gao

http://llvm-reviews.chandlerc.com/D2560

Files:
  lib/CodeGen/CGRecordLayoutBuilder.cpp
  test/CodeGenCXX/bitfield.cpp

Index: lib/CodeGen/CGRecordLayoutBuilder.cpp
===================================================================
--- lib/CodeGen/CGRecordLayoutBuilder.cpp
+++ lib/CodeGen/CGRecordLayoutBuilder.cpp
@@ -345,6 +345,9 @@
   // these in conjunction with shifts and masks to narrower operations where
   // beneficial.
   uint64_t EndOffset = Types.getContext().toBits(Layout.getDataSize());
+  if (Types.getContext().getLangOpts().CPlusPlus)
+    // Do not grow the bitfield storage into the following virtual base.
+    EndOffset = Types.getContext().toBits(Layout.getNonVirtualSize());
   if (BFE != FE)
     // If there are more fields to be laid out, the offset at the end of the
     // bitfield is the offset of the next field in the record.
Index: test/CodeGenCXX/bitfield.cpp
===================================================================
--- test/CodeGenCXX/bitfield.cpp
+++ test/CodeGenCXX/bitfield.cpp
@@ -426,3 +426,55 @@
     s->b2 = x;
   }
 }
+
+namespace N7 {
+  // Similar to N4 except that this adds a virtual base to the picture.
+  // Do NOT widen loads and stores to bitfields into padding at the end of
+  // a class which might end up with members inside of it when inside a derived
+  // class.
+  struct B1 {
+    virtual void f();
+    unsigned b1 : 24;
+  };
+  struct B2 : virtual B1 {
+    virtual ~B2();
+    unsigned b : 24;
+  };
+  // Imagine some other translation unit introduces:
+#if 0
+  struct Derived : public B2 {
+    char c;
+  };
+#endif
+  unsigned read(B2* s) {
+    // FIXME: We should widen this load as long as the function isn't being
+    // instrumented by thread-sanitizer.
+    //
+    // CHECK-X86-64-LABEL: define i32 @_ZN2N74read
+    // CHECK-X86-64:   %[[gep:.*]] = getelementptr inbounds {{.*}}* %{{.*}}, i32 0, i32 1
+    // CHECK-X86-64:   %[[ptr:.*]] = bitcast [3 x i8]* %[[gep]] to i24*
+    // CHECK-X86-64:   %[[val:.*]] = load i24* %[[ptr]]
+    // CHECK-X86-64:   %[[ext:.*]] = zext i24 %[[val]] to i32
+    // CHECK-X86-64:                 ret i32 %[[ext]]
+    // CHECK-PPC64-LABEL: define zeroext i32 @_ZN2N74read
+    // CHECK-PPC64:   %[[gep:.*]] = getelementptr inbounds {{.*}}* %{{.*}}, i32 0, i32 1
+    // CHECK-PPC64:   %[[ptr:.*]] = bitcast [3 x i8]* %[[gep]] to i24*
+    // CHECK-PPC64:   %[[val:.*]] = load i24* %[[ptr]]
+    // CHECK-PPC64:   %[[ext:.*]] = zext i24 %[[val]] to i32
+    // CHECK-PPC64:                 ret i32 %[[ext]]
+    return s->b;
+  }
+  void write(B2* s, unsigned x) {
+    // CHECK-X86-64-LABEL: define void @_ZN2N75write
+    // CHECK-X86-64:   %[[gep:.*]] = getelementptr inbounds {{.*}}* %{{.*}}, i32 0, i32 1
+    // CHECK-X86-64:   %[[ptr:.*]] = bitcast [3 x i8]* %[[gep]] to i24*
+    // CHECK-X86-64:   %[[new:.*]] = trunc i32 %{{.*}} to i24
+    // CHECK-X86-64:                 store i24 %[[new]], i24* %[[ptr]]
+    // CHECK-PPC64-LABEL: define void @_ZN2N75write
+    // CHECK-PPC64:   %[[gep:.*]] = getelementptr inbounds {{.*}}* %{{.*}}, i32 0, i32 1
+    // CHECK-PPC64:   %[[ptr:.*]] = bitcast [3 x i8]* %[[gep]] to i24*
+    // CHECK-PPC64:   %[[new:.*]] = trunc i32 %{{.*}} to i24
+    // CHECK-PPC64:                 store i24 %[[new]], i24* %[[ptr]]
+    s->b = x;
+  }
+}