[PATCH] D43286: [CodeGen] Generate DWARF v5 Accelerator Tables

Thu Mar 8 08:21:40 PST 2018

labath added a comment.

I should also call out that this does not support type units and it does not represent the info that was present in the apple_objc accelerator tables. For the first part, that's mainly because we don't generate dwarf5-compatible type units right now. For the second item, it's because it was not clear to me how the apple_objc table maps to debug_names. It was fairly obvious that the other three tables should be merged into one, but it is not clear to me how to map the objc one, as it is doing something a bit strange: it uses the class name as a lookup key, but it points to DIEs of the individual member functions.

================
Comment at: lib/CodeGen/AsmPrinter/AccelTable.cpp:15
 #include "llvm/CodeGen/AccelTable.h"
+#include "DwarfCompileUnit.h"
 #include "llvm/ADT/STLExtras.h"
----------------
JDevlieghere wrote:
> should be on top (clang-format should do this for you IIRC)
clang-format puts AccelTable.h on top because that's the header this cpp file implements. The reason this looks weird is because DwarfCompileUnit is in the lib folder, so I have to use the local include  path. Otherwise, this would be down below with the rest of llvm/CodeGen includes. The other files in this folder have the same style as well.

================
Comment at: lib/CodeGen/AsmPrinter/AccelTable.cpp:193
+    uint32_t AugmentationStringSize = sizeof(AugmentationString);
+    char AugmentationString[8] = {'L', 'L', 'V', 'M', '0', '7', '0', '0'};
+    static_assert(sizeof(AugmentationString) % 4 == 0, "");
----------------
JDevlieghere wrote:
> labath wrote:
> > aprantl wrote:
> > > JDevlieghere wrote:
> > > > Hmm, we should probably move this elsewhere and have Cmake generate it for us? This seems hard to maintain and quite a burden, wouldn't you agree?
> > > What does the '0700' derive from?
> > If we always want to put the current version here, then that would definitely be a good thing to do. However, it wasn't clear to me whether we want to do that, or only bump the version when we make a substantial change in the information we generate (I hardcoded it because I assumed the latter). However, maybe we shouldn't even follow the llvm versions, but put some completely independent strings here ? I'm also starting to think the format should be more human-readable (e.g. "LLVM 7.0.0" instead of "LLVM0700").
> > 
> > Basically, I have no idea what would be a good thing to put here, because I don't know how (or if) will we end up using it. I am open to any suggestions...
> If we only want to change this when making changes to the format we'd probably want to use something other than the compiler version? Also, I'm guessing the last 0 is the null terminator, so that means we cant' change it for minor versions (which probably we shouldn't anyway, but who knows). Counting from zero/one also seems weird. 
> 
> Anyway I think it's good to have this, just not sure what to put there. I'll sleep on it and get back to you tomorrow :-) 
The last 0 is not the null terminator (we don't need it as we specify the length explicitly). But in any case, there's no upper bound on the augmentation string length. It only needs to be a multiple of four.

================
Comment at: lib/CodeGen/AsmPrinter/AccelTable.cpp:403
+    dwarf::Form Form;
+    if (CompUnits.size() <= 0x100)
+      Form = dwarf::DW_FORM_data1;
----------------
aprantl wrote:
> should this be factored out?
I've put the form calculation it into a static function. Or did you mean something more fancy?

================
Comment at: lib/CodeGen/AsmPrinter/AccelTable.cpp:408
+    else {
+      assert(CompUnits.size() < 0x1000000ull && "Too many compilation units");
+      Form = dwarf::DW_FORM_data4;
----------------
labath wrote:
> aprantl wrote:
> > This should probably be a hard error rather than an assertion. We don't want the backend to be exploitable by malformed input more than necessary :-)
> How do I do a "hard error"? (Also, technically, I don't think this should lead to UB, only to truncation of certain values (and generation of broken tables).)
I've replaced this with a call to abort(). Does that count as a hard error?
BTW, the reason why we cannot describe more than this many CUs is because of the CompUnitCount field, which is 32-bit (even for DWARF64). Right now we don't generate DWARF64, and for DWARF32 with this many compilation units, I would expect things to break way before we reach this part.

================
Comment at: lib/CodeGen/AsmPrinter/AccelTable.cpp:469
+  switch (Form) {
+  case dwarf::DW_FORM_data1:
+    Size = 1;
----------------
JDevlieghere wrote:
> labath wrote:
> > aprantl wrote:
> > > Don't we have this somewhere else already and if not, should it be factored into a common utility function?
> > We have something in the DebugInfo library, but I haven't seen anything in CodeGen. I'll have another look tomorrow...
> I've seen similar code quite a bit the last few weeks but always in different parts which made it hard to reuse. Maybe it's time to put something as general as possible in Dwarf.def? The problem is usually that there's no fixed length for (U)LEB128 encoded forms.
It seems it should be fairly easy to move the DWARFFormParams class and the static DWARFFormValue::getFixedByteSize function somewhere into the BinaryFormat library. Jonas, can you give me pointers to the other places which do form size calculations?

================
Comment at: test/DebugInfo/Generic/debug-names-hash-collisions.ll:30
+; CHECK: Bucket 0
+; CHECK:     Hash: 0xF8CF70D
+; CHECK:     String: 0x{{[0-9a-f]*}} "_ZN4lldb7SBBlockaSERKS0_"
----------------
aprantl wrote:
> CHECK-NEXT?
I've added check-next where it's possible right now (the String: line immediately follows the Hash: line). I can't use it everywhere without adding expectations about the stuff I don't care about for this test.  I'm not sure where you were heading with this, but if the issue was to make sure the Buckets don't contain any extra names, i've added some CHECK-NOT lines to prevent that.

Repository:
  rL LLVM

https://reviews.llvm.org/D43286