[PATCH] Emit DWARF info for all code section in an assembly file

David Blaikie dblaikie at gmail.com
Thu Apr 3 09:28:46 PDT 2014


  General comments:

  Test coverage:
  1) a test case with more than one section, but only one section with instructions (it'd be nice for this not to use ranges, but just high_pc/low_pc)
  2) a test case with more than one section, but at least two sections with instructions and at least one section without (just to make sure we can omit those sections while still including a ranges section with the other sections)

  The refactoring into a single data structure containing a mapping from section to start/end symbols would be nice.

  Probably a bridge too far, but this starts to look almost, sort of, like what we'd like to have for source level debugging handling too. It might be nice to consider a generic range list representation that can handle all the ranges and emission. Though source level debugging needs the concept of multiple ranges even in a single section, which is an added complication compared to this code.


================
Comment at: include/llvm/MC/MCContext.h:132
@@ +131,3 @@
+    /// generating the debug_aranges section
+    DenseMap<const MCSection*, MCSymbol*> GenDwarfSectionStartSyms;
+    DenseMap<const MCSection*, MCSymbol*> GenDwarfSectionEndSyms;
----------------
Having 3 data structures all with the same key and related values is a bit of a pity. Could we get a struct (or even just a pair) of start/end symbol and have one map to that pair?

I assume "GenDwarfSections" is then just the key set from that map, right? So you don't need a separate data structure for that as well, you can just use the presence/absence in the map

Also - I assume the iteration order over the keys is important for stability of the output of the ranges section, so this might need to be a MapVector rather than just a DenseMap?

================
Comment at: include/llvm/MC/MCContext.h:386
@@ +385,3 @@
+        return Sym->second;
+      return NULL;
+    }
----------------
I think we can move to using nullptr these days - could just fix this in all places in your patch, but not strictly necessary I suppose.

================
Comment at: include/llvm/MC/MCContext.h:382
@@ +381,3 @@
+    void addGenDwarfSection(const MCSection *Sec) { GenDwarfSections.insert(Sec); }
+    MCSymbol *getGenDwarfSectionStartSym(const MCSection* Sec) {
+      DenseMap<const MCSection *, MCSymbol*>::iterator Sym = GenDwarfSectionStartSyms.find(Sec);
----------------
With the suggested data structure changes above, these get/set functions probably just turn into one or two functions (two if you need const overloads, one if you don't) that, given an MCSection* returns the pair/structure containing the start and end.

================
Comment at: lib/MC/MCDwarf.cpp:525
@@ +524,3 @@
+    MCOS->EmitLabel(SectionEndSym);
+    context.setGenDwarfSectionEndSym(*sec, SectionEndSym);
+
----------------
Perhaps rather than creating end labels, etc, for sections that don't have any instructions anyway, and using another data structure, we could just remove the element from the Sections set/map immediately - then iterate the map later on?

================
Comment at: lib/MC/MCDwarf.cpp:658
@@ +657,3 @@
+    while (true) {
+      assert(TextSection != Sections.end() && "No text section found");
+      MCOS->SwitchSection(*TextSection);
----------------
Would this break on an empty asm file (or one with only data, perhaps?)?

================
Comment at: lib/MC/MCDwarf.cpp:826
@@ +825,3 @@
+  std::set<const MCSection*> Sections = MCOS->getContext().getGenDwarfSections();
+  for (std::set<const MCSection*>::const_iterator sec = Sections.begin();
+       sec != Sections.end(); sec++) {
----------------
Again, seems easier to just remove the sections that aren't relevant (& I'm not sure why this pass and the other one that checks for instructions and builds another data structure and closes ranges is separate - perhaps once the sections are just removed there won't be a need for two passes)

================
Comment at: lib/MC/MCDwarf.cpp:874
@@ +873,3 @@
+  // to do anything
+  if (MCOS->getContext().getGenDwarfSections().empty())
+    return;
----------------
Seems like this could go in separately - if we have null begin/end data today (when we track only one range) we should do the same thing, right? Be good to just add that check (then rephrase it to the empty test here in the following commit) and test case first, possibly, maybe... 

================
Comment at: lib/MC/MCDwarf.cpp:824
@@ +823,3 @@
+    MCOS->SwitchSection(*sec);
+    if (!MCOS->hasInstructions()) continue;
+    MCOS->SwitchSection(context.getObjectFileInfo()->getDwarfRangesSection());
----------------
So this code would result in us possibly emitting a ranges section with only one range/section, right? (if there were multiple sections but only one turns out to have instructions)

That can probably be tidied up later, though.

================
Comment at: lib/MC/MCDwarf.cpp:823
@@ +822,3 @@
+       sec != Sections.end(); sec++) {
+    MCOS->SwitchSection(*sec);
+    if (!MCOS->hasInstructions()) continue;
----------------
This SwitchSection call seems bogus/dangerous - if we ever leave the loop via the continue on the next line, then we'll emit the range terminator into the section we were just looking at, not the range section.

================
Comment at: include/llvm/MC/MCContext.h:396
@@ +395,3 @@
+               GenDwarfSections.begin();
+           sec != GenDwarfSections.end(); ++sec) {
+        MCOS.SwitchSection(*sec);
----------------
If you do the "hasInstructions" check here then you could avoid creating end symbols and just remove the elements from the container entirely - that would address the issue of creating ranges with only one range later on, right?

================
Comment at: lib/MC/MCParser/ELFAsmParser.cpp:557
@@ +556,3 @@
+          getContext().getGenDwarfSections().size() > 1)
+        Error(loc, "DWARF2 only supports one section per compilation unit");
+    }
----------------
What does GCC do here? (silently fall back to just describing the .text section?)

================
Comment at: lib/MC/MCParser/ELFAsmParser.cpp:553
@@ +552,3 @@
+  if (getContext().getGenDwarfForAssembly()) {
+    if (!getContext().getGenDwarfSections().count(ELFSection)) {
+      getContext().addGenDwarfSection(ELFSection);
----------------
Rather than doing a count and then an insertion, you should just be able to insert and use the bool in the returned iterator/bool pair to determine if this is the first time the section has been added to the list.

================
Comment at: lib/MC/MCParser/ELFAsmParser.cpp:560
@@ +559,3 @@
+
+    if (!getContext().getGenDwarfSectionStartSym(ELFSection)) {
+      MCSymbol *SectionStartSymbol = getContext().CreateTempSymbol();
----------------
Shouldn't this condition be the same as the one above (ie: if the element isn't in the sections list it also doesn't have a start sym?) Seems unnecessary to do two lookups. (& based on the suggested data structure change this would be one insertion, check whether the insertion succeeded (because the element wasn't already present), then provide the necessary error and/or create the start symbol)

pair = sections.insert(make_pair(section, range()));
if (pair.second) {
  if (dwarf <= 2 && sections > 1)
    error
  start = createTempSymbol
  emit(start);
  pair.first->start = start;
}

Something like that anyway.


http://llvm-reviews.chandlerc.com/D2697



More information about the llvm-commits mailing list