[clang] [CIR] Add initial support for bitfields in structs (PR #142041)

Tue Jun 3 16:37:26 PDT 2025

================
@@ -223,21 +272,114 @@ void CIRRecordLowering::fillOutputFields() {
             fieldTypes.size() - 1;
       // A field without storage must be a bitfield.
       assert(!cir::MissingFeatures::bitfields());
+      if (!member.data)
+        setBitFieldInfo(member.fieldDecl, member.offset, fieldTypes.back());
     }
     assert(!cir::MissingFeatures::cxxSupport());
   }
 }
 
+void CIRRecordLowering::accumulateBitFields(
+    RecordDecl::field_iterator field, RecordDecl::field_iterator fieldEnd) {
+  // Run stores the first element of the current run of bitfields.  FieldEnd is
+  // used as a special value to note that we don't have a current run.  A
+  // bitfield run is a contiguous collection of bitfields that can be stored in
+  // the same storage block.  Zero-sized bitfields and bitfields that would
+  // cross an alignment boundary break a run and start a new one.
+  RecordDecl::field_iterator run = fieldEnd;
+  // Tail is the offset of the first bit off the end of the current run.  It's
+  // used to determine if the ASTRecordLayout is treating these two bitfields as
+  // contiguous.  StartBitOffset is offset of the beginning of the Run.
+  uint64_t startBitOffset, tail = 0;
+  assert(!cir::MissingFeatures::isDiscreteBitFieldABI());
+
+  // Check if OffsetInRecord (the size in bits of the current run) is better
+  // as a single field run. When OffsetInRecord has legal integer width, and
+  // its bitfield offset is naturally aligned, it is better to make the
+  // bitfield a separate storage component so as it can be accessed directly
+  // with lower cost.
+  auto isBetterAsSingleFieldRun = [&](uint64_t offsetInRecord,
+                                      uint64_t startBitOffset,
+                                      uint64_t nextTail = 0) {
+    if (!cirGenTypes.getCGModule().getCodeGenOpts().FineGrainedBitfieldAccesses)
----------------
Andres-Salamanca wrote:

The reason we're not emitting the same layout as CodeGen, or one that looks like `FineGrainedBitfieldAccesses`, is this condition:
`tail == getFieldBitOffset(*field)`

For example, given the following struct:

```c++
struct S {
  char a, b, c;
  unsigned bits : 2;
  unsigned more_bits : 5;
  unsigned still_more_bits : 7;
} a;
```

The data layout looks something like this:
![image](https://github.com/user-attachments/assets/fc03b1a5-8ac8-4aab-943f-eeea480e31e2)


Because of the way the layout is defined, the bit-fields are not contiguous in the `ASTRecordLayout`. As a result, CIR is not packing the three bit-fields into a single integer like CodeGen does.
CodeGen emits:
```
%struct.S = type <{ i8, i8, i8, i16, [3 x i8] }>
```
While CIR emits:

```
!rec_S = !cir.record<struct "S" padded {!s8i, !s8i, !s8i, !u8i, !u8i, !cir.array<!u8i x 3>}>
```
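
To make the run break concrete, here is a rough sketch of the contiguity check applied to `S`. It does not use the real `ASTRecordLayout`; `BitField`, `Placed`, `placeBitFields`, and `countRuns` are hypothetical helpers, and the placement rule is an assumption (an Itanium-style ABI with a 32-bit `unsigned`, where a bit-field may not straddle a boundary of its declared type).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit-field declaration (not a clang type).
struct BitField {
  const char *name;
  uint64_t width; // declared width in bits, assumed non-zero here
};

// A bit-field together with the bit offset assigned to it.
struct Placed {
  const char *name;
  uint64_t offset;
  uint64_t width;
};

// Assumed placement rule: a bit-field that would cross a typeBits boundary
// is moved to the next boundary. Zero-width bit-fields are not modeled.
std::vector<Placed> placeBitFields(uint64_t startBit,
                                   const std::vector<BitField> &fields,
                                   uint64_t typeBits = 32) {
  std::vector<Placed> out;
  uint64_t next = startBit;
  for (const BitField &f : fields) {
    assert(f.width > 0 && f.width <= typeBits);
    if (next / typeBits != (next + f.width - 1) / typeBits)
      next = (next / typeBits + 1) * typeBits;
    out.push_back({f.name, next, f.width});
    next += f.width;
  }
  return out;
}

// Counts runs using the condition quoted above: a run continues only while
// tail (first bit past the current run) equals the next field's bit offset.
unsigned countRuns(const std::vector<Placed> &placed) {
  unsigned runs = 0;
  uint64_t tail = 0;
  for (const Placed &p : placed) {
    if (runs == 0 || tail != p.offset)
      ++runs; // gap in the layout: start a new run
    tail = p.offset + p.width;
  }
  return runs;
}
```

Under these assumptions, placing `bits`, `more_bits`, and `still_more_bits` after the three `char` members (starting at bit 24) leaves a one-bit gap before `still_more_bits`, so the check splits them into two runs. That would be consistent with the two `!u8i` storage units in the CIR record above.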
Should we consider changing the implementation here?
Alternatively, we could follow @andykaylor's idea of representing it like this:

```
!cir.record<struct "S" {!s32i, !cir.bitfield<2>, !cir.bitfield<5>, !cir.bitfield<7>}>
```
We could then defer the storage layout decision to the lowering phase.

@bcardosolopes 

https://github.com/llvm/llvm-project/pull/142041