[LLVMdev] [patch] Prototype/proof-of-concept for DWARF type units

Tue Oct 1 19:45:16 PDT 2013

Hi Dave,

On Mon, Sep 30, 2013 at 2:34 PM, David Blaikie <dblaikie at gmail.com> wrote:
> This isn't a realistic/viable implementation, just a hacked up "can I make
> it produce the right output" kind of thing, but while I hammer out a few
> more details (like fixing MC to allow multiple sections with the same name
> but different comdat groups) I figured I'd throw it out there to have a bit
> of a chat about it.
>

This is definitely a good start.

> I've tested simple cases of a single type and they seem to work OK. Multiple
> types from the same translation unit won't work at the moment due to the MC
> issue.

OK.

>
> The practical details are fairly simple. When CompileUnit goes to create a
> type, first it hands it off to DwarfDebug to decide whether it's going to be
> in a type unit. If it is, DwarfDebug spins up a new CompileUnit that's
> really a type unit, emits the type into that unit (creating all dependent
> types and other paraphernalia), attaches the signature (using the ODR hash
> as a stand-in for now) and offset to the CompileUnit and the signature to
> the DIE the original CompileUnit provided (along with any other DIEs that
> needed that signature and were discovered along the way).
>
> Then emit it as normal - with a few special cases to add the necessary extra
> headers, offsets, etc.
>

This is definitely how I was thinking this would work, yes.

> A realistic implementation probably involves moving unit header handling
> down into CompileUnit, extracting most of the common functionality into a
> Unit type and having CompileUnit and TypeUnit types to handle the header
> differences. Sound about right?
>

Sounds good to me here.

> I haven't got to looking at how this is meant to (or currently does)
> interact with fission, but I suspect it might be pretty close 'for free'.

It should be orthogonal yes.

Few comments on random aspects of the patch:

@@ -789,7 +797,7 @@ DIE *CompileUnit::getOrCreateTypeDIE(const MDNode *TyNode) {
     return TyDIE;

   // Create new type.
-  TyDIE = new DIE(dwarf::DW_TAG_base_type);
+  TyDIE = new DIE(Ty.getTag());
   insertDIE(Ty, TyDIE);
   if (Ty.isBasicType())
     constructTypeDIE(*TyDIE, DIBasicType(Ty));
@@ -911,13 +919,10 @@ void CompileUnit::constructTypeDIE(DIE &Buffer,
DIBasicType BTy) {
   if (!Name.empty())
     addString(&Buffer, dwarf::DW_AT_name, Name);

-  if (BTy.getTag() == dwarf::DW_TAG_unspecified_type) {
-    Buffer.setTag(dwarf::DW_TAG_unspecified_type);
-    // An unspecified type only has a name attribute.
+  // An unspecified type only has a name attribute.
+  if (BTy.getTag() == dwarf::DW_TAG_unspecified_type)
     return;
-  }

-  Buffer.setTag(dwarf::DW_TAG_base_type);
   addUInt(&Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
           BTy.getEncoding());

Probably a fine separate cleanup.

@@ -930,12 +935,11 @@ void CompileUnit::constructTypeDIE(DIE &Buffer,
DIDerivedType DTy) {
   // Get core information.
   StringRef Name = DTy.getName();
   uint64_t Size = DTy.getSizeInBits() >> 3;
-  uint16_t Tag = DTy.getTag();

   // FIXME - Workaround for templates.
-  if (Tag == dwarf::DW_TAG_inheritance) Tag = dwarf::DW_TAG_reference_type;
-
-  Buffer.setTag(Tag);
+  if (Buffer.getTag() == dwarf::DW_TAG_inheritance)
+    Buffer.setTag(dwarf::DW_TAG_reference_type);
+  uint16_t Tag = DTy.getTag();

Ditto.

Extra credit: Wat?

@@ -966,6 +970,8 @@ static bool isTypeUnitScoped(DIType Ty, const
DwarfDebug *DD) {
     // Don't generate a hash for anything scoped inside a function.
     if (Parent.isSubprogram())
       return false;
+    if (Parent.isNameSpace() && DINameSpace(Parent).getName().empty())
+      return false;
     Parent = DD->resolve(Parent.getContext());
   }
   return true;

Ditto. :)

@@ -973,6 +979,9 @@ static bool isTypeUnitScoped(DIType Ty, const
DwarfDebug *DD) {

 /// Return true if the type should be split out into a type unit.
 static bool shouldCreateTypeUnit(DICompositeType CTy, const DwarfDebug *DD) {
+  if (CTy.getName().empty())
+    return false;
+
   uint16_t Tag = CTy.getTag();

   switch (Tag) {

This too.

@@ -983,22 +992,29 @@ static bool shouldCreateTypeUnit(DICompositeType
CTy, const DwarfDebug *DD) {
     // If this is a class, structure, union, or enumeration type
     // that is not a declaration, is a type definition, and not scoped
     // inside a function then separate this out as a type unit.
-    if (CTy.isForwardDecl() || !isTypeUnitScoped(CTy, DD))
-      return 0;
-    return 1;
+    return !CTy.isForwardDecl() && isTypeUnitScoped(CTy, DD);
   default:
-    return 0;
+    return false;
   }
 }

and again. For some reason while I thought we preferred 1/0 in llvm to
true/false, it also just isn't consistent so fixing would be good.

-  uint16_t getLanguage() const { return DICompileUnit(Node).getLanguage(); }
+  uint16_t getLanguage() const { return Language; }

One more.

@@ -332,16 +344,17 @@ public:
   void addPubTypes(DISubprogram SP);

   /// constructTypeDIE - Construct basic type die from DIBasicType.
-  void constructTypeDIE(DIE &Buffer,
-                        DIBasicType BTy);
+  void constructTypeDIE(DIE &Buffer, DIBasicType BTy);

   /// constructTypeDIE - Construct derived type die from DIDerivedType.
-  void constructTypeDIE(DIE &Buffer,
-                        DIDerivedType DTy);
+  void constructTypeDIE(DIE &Buffer, DIDerivedType DTy);

   /// constructTypeDIE - Construct type DIE from DICompositeType.
-  void constructTypeDIE(DIE &Buffer,
-                        DICompositeType CTy);
+  void constructTypeDIE(DIE &Buffer, DICompositeType CTy);
+

and again.

-static cl::opt<bool>
-GenerateODRHash("generate-odr-hash", cl::Hidden,
-                cl::desc("Add an ODR hash to external type DIEs."),
-                cl::init(false));

Probably don't want to remove this option yet. The basic functionality
still has some use for an ODR checker.

+bool DwarfDebug::addTypeUnitType(DIE *RefDie, DICompositeType CTy) {
+  if (!GenerateTypeUnits)
+    return false;

I'd rather see this check in shouldCreateTypeUnit and an assert here. Thoughts?

+    // If we've got multiple CUs, we might've emitted the same type in several,
+    // we could deduplicate them here rather than relying on the linker to do
+    // so.

This is the case (or had better be) for linking multiple TUs with
llvm, though I'm not sure what the comment is for really.

+    // The only reason addUint et. al. are members of CompileUnit is
because the
+    // DIEValueAllocator is a member of the CompileUnit - perhaps we
should move
+    // that out to DwarfDebug instead of having one per CompileUnit.

Yes please.

+    InfoHolder.computeSizeAndOffset(
+        UnitDie, sizeof(uint32_t) + sizeof(uint16_t) + sizeof(uint32_t) +
+                     sizeof(int8_t) + sizeof(uint64_t) + sizeof(uint32_t));

I realize this is just a prototype, but since I'm noticing - it'd be
nice to have what's what in the header that you're adding here. :)

-  void emitUnits(DwarfDebug *DD, const MCSection *USection,
+  void emitUnits(DwarfDebug *DD, const MCObjectFileInfo &FileInfo, bool Split,
                  const MCSection *ASection, const MCSymbol *ASectionSym);

Seems like a good change. Might be able to remove ASection here as
well? Not being able to remove the Sym is unfortunate. Might make
sense to add the section syms to the MC layer.

In general it looks great and a lot of good cleanups too. Thanks for
all the work here.

-eric