[llvm] 119aecb - [DebugInfo] Emit negative DW_AT_bit_offset in explicit signed form (#87994)

Mon May 13 03:14:39 PDT 2024

Author: Victor Campos
Date: 2024-05-13T11:14:35+01:00
New Revision: 119aecb955df91173d69c455bba0abd74271c215

URL: https://github.com/llvm/llvm-project/commit/119aecb955df91173d69c455bba0abd74271c215
DIFF: https://github.com/llvm/llvm-project/commit/119aecb955df91173d69c455bba0abd74271c215.diff

LOG: [DebugInfo] Emit negative DW_AT_bit_offset in explicit signed form (#87994)

Before this patch, the value of DW_AT_bit_offset, used for bitfields
before DWARF version 4, was always emitted as an unsigned integer using
the form DW_FORM_data<n>. If the value was originally a signed integer,
for instance in the case of negative offsets, it was up to debug
information consumers to re-cast it to a signed integer.

This is problematic since the burden of deciding if the value should be
read as signed or unsigned was put onto the debug info consumers: the
DWARF specification doesn't define DW_AT_bit_offset's underlying type.
If a debugger decided to interpret this attribute in the form data<n> as
unsigned, then negative offsets would be completely broken.

The DWARF specification version 3 mentions in the Data Representation
section, page 127:

> If one of the DW_FORM_data<n> forms is used to represent a signed or
unsigned integer, it can be hard for a consumer to discover the context
necessary to determine which interpretation is intended. Producers are
therefore strongly encouraged to use DW_FORM_sdata or DW_FORM_udata for
signed and unsigned integers respectively, rather than DW_FORM_data<n>.

Therefore, the proposal is to use DW_FORM_sdata, which is explicitly
signed. This is an indication to consumers that the offset must be
parsed unambiguously as a signed integer.

Finally, gcc already uses DW_FORM_sdata for negative offsets, fixing the
potential ambiguity altogether.

This patch mimics gcc's behaviour by emitting negative values of
DW_AT_bit_offset using the DW_FORM_sdata form. This eliminates any
potential misinterpretation.

One could argue that all values should use DW_FORM_sdata, but for the
sake of parity with gcc, it is safe to restrict the change to negative
values.

Added: 
    

Modified: 
    llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
    llvm/test/DebugInfo/ARM/bitfield.ll
    llvm/test/DebugInfo/NVPTX/packed_bitfields.ll
    llvm/test/DebugInfo/X86/packed_bitfields.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index 56c288ee95b43..6533e8281631a 100644

--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -30,6 +30,7 @@
 #include "llvm/Target/TargetLoweringObjectFile.h"
 #include <cassert>
 #include <cstdint>
+#include <limits>
 #include <string>
 #include <utility>
 
@@ -1649,7 +1650,8 @@ DIE &DwarfUnit::constructMemberDIE(DIE &Buffer, const DIDerivedType *DT) {
         addUInt(MemberDie, dwarf::DW_AT_byte_size, std::nullopt, FieldSize / 8);
       addUInt(MemberDie, dwarf::DW_AT_bit_size, std::nullopt, Size);
 
-      uint64_t Offset = DT->getOffsetInBits();
+      assert(DT->getOffsetInBits() <= std::numeric_limits<int64_t>::max());
+      int64_t Offset = DT->getOffsetInBits();
       // We can't use DT->getAlignInBits() here: AlignInBits for member type
       // is non-zero if and only if alignment was forced (e.g. _Alignas()),
       // which can't be done with bitfields. Thus we use FieldSize here.
@@ -1669,7 +1671,12 @@ DIE &DwarfUnit::constructMemberDIE(DIE &Buffer, const DIDerivedType *DT) {
         if (Asm->getDataLayout().isLittleEndian())
           Offset = FieldSize - (Offset + Size);
 
-        addUInt(MemberDie, dwarf::DW_AT_bit_offset, std::nullopt, Offset);
+        if (Offset < 0)
+          addSInt(MemberDie, dwarf::DW_AT_bit_offset, dwarf::DW_FORM_sdata,
+                  Offset);
+        else
+          addUInt(MemberDie, dwarf::DW_AT_bit_offset, std::nullopt,
+                  (uint64_t)Offset);
         OffsetInBytes = FieldOffset >> 3;
       } else {
         addUInt(MemberDie, dwarf::DW_AT_data_bit_offset, std::nullopt, Offset);

diff  --git a/llvm/test/DebugInfo/ARM/bitfield.ll b/llvm/test/DebugInfo/ARM/bitfield.ll
index 5bd06b785b159..672c61db6f491 100644
--- a/llvm/test/DebugInfo/ARM/bitfield.ll
+++ b/llvm/test/DebugInfo/ARM/bitfield.ll
@@ -12,7 +12,7 @@
 ; CHECK:          DW_AT_name {{.*}} "reserved"
 ; CHECK:          DW_AT_byte_size  {{.*}} (0x04)
 ; CHECK:          DW_AT_bit_size   {{.*}} (0x1c)
-; CHECK:          DW_AT_bit_offset {{.*}} (0xfffffffffffffff8)
+; CHECK:          DW_AT_bit_offset {{.*}} (-8)
 ; CHECK:          DW_AT_data_member_location {{.*}} (DW_OP_plus_uconst 0x0)
 
 %struct.anon = type { i8, [5 x i8] }

diff  --git a/llvm/test/DebugInfo/NVPTX/packed_bitfields.ll b/llvm/test/DebugInfo/NVPTX/packed_bitfields.ll
index e2097d7f49b48..62ffa0a4001f1 100644
--- a/llvm/test/DebugInfo/NVPTX/packed_bitfields.ll
+++ b/llvm/test/DebugInfo/NVPTX/packed_bitfields.ll
@@ -16,7 +16,7 @@
 ; CHECK-NEXT: .b8 1    // DW_AT_byte_size
 ; CHECK-NEXT: .b8 6    // DW_AT_bit_size
 ; Negative offset must be encoded as an unsigned integer.
-; CHECK-NEXT: .b64 0xffffffffffffffff // DW_AT_bit_offset
+; CHECK-NEXT: .b8 127  // DW_AT_bit_offset
 ; CHECK-NEXT: .b8 2    // DW_AT_data_member_location
 
 %struct.anon = type { i16 }

diff  --git a/llvm/test/DebugInfo/X86/packed_bitfields.ll b/llvm/test/DebugInfo/X86/packed_bitfields.ll
index 0e541f09d2270..614fa59c36784 100644
--- a/llvm/test/DebugInfo/X86/packed_bitfields.ll
+++ b/llvm/test/DebugInfo/X86/packed_bitfields.ll
@@ -15,7 +15,7 @@
 ; CHECK-NOT: DW_TAG_member
 ; CHECK:      DW_AT_byte_size  {{.*}} (0x01)
 ; CHECK-NEXT: DW_AT_bit_size   {{.*}} (0x06)
-; CHECK-NEXT: DW_AT_bit_offset {{.*}} (0xffffffffffffffff)
+; CHECK-NEXT: DW_AT_bit_offset {{.*}} (-1)
 ; CHECK-NEXT: DW_AT_data_member_location {{.*}} ({{.*}}0x0{{0*}})
 
 ; ModuleID = 'repro.c'