[PATCH] [RFC] Use llvm-symbolizer to symbolize LLVM/Clang crash dumps

Alexey Samsonov vonosmas at gmail.com
Fri Oct 3 16:10:16 PDT 2014


================
Comment at: lib/Support/Unix/Signals.inc:284
@@ +283,3 @@
+  DlIteratePhdrData *data = (DlIteratePhdrData*)arg;
+  const char *name = (data->first) ? data->main_exec_name : info->dlpi_name;
+  data->first = false;
----------------
dblaikie wrote:
> parens around data->first seem unnecessary
Done

================
Comment at: lib/Support/Unix/Signals.inc:287
@@ +286,3 @@
+  for (int i = 0; i < info->dlpi_phnum; i++) {
+    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
+    if (phdr->p_type == PT_LOAD) {
----------------
dblaikie wrote:
> Well that's some syntax I haven't seen before... I assume it's valid/compiles?
> 
> I would've expected to see that written as:
> 
>   const EelfW (*phdr)(Phdr) = ...
> 
> but if what you've got there works... great... just surprising.
ElfW() is a macro, which here expands ElfW(Phdr) to Elf32_Phdr, or Elf64_Phdr.

================
Comment at: lib/Support/Unix/Signals.inc:288
@@ +287,3 @@
+    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
+    if (phdr->p_type == PT_LOAD) {
+      intptr_t beg = info->dlpi_addr + phdr->p_vaddr;
----------------
dblaikie wrote:
> could be an early continue instead
Done

================
Comment at: lib/Support/Unix/Signals.inc:294
@@ +293,3 @@
+        intptr_t addr = (intptr_t)data->StackTrace[j];
+        if (beg <= addr && addr < end) {
+          data->modules[j] = name;
----------------
dblaikie wrote:
> could be an early continue, but is of questionable benefit (same number of lines of code, only two lines to unindent)
Left as is (IMO negative condition for "lies in the interval" is less clear).

================
Comment at: lib/Support/Unix/Signals.inc:301
@@ +300,3 @@
+  }
+  return 0;
+}
----------------
dblaikie wrote:
> The return value seems to be always zero and unused. Remove it
This function is passed as a callback into dl_iterate_phdr, so we must have "int" as a return value here.

================
Comment at: lib/Support/Unix/Signals.inc:326
@@ +325,3 @@
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+  sys::fs::createTemporaryFile("symbolizer-output", "", OutputFile);
----------------
dblaikie wrote:
> hmm - should we not be just reading the output stream straight from the process via a pipe, rather than writing it to disk and reading it back from there?
> 
> Maybe we don't have portable primitives for that. (maybe there's no such common/portable primitive)
Yeah, unfortunately sys::ExecuteAndWait and friends only accepts files, not file descriptors for redirects. I can work on adding an overload (though, using files proved to be convenient while debugging this code).

================
Comment at: lib/Support/Unix/Signals.inc:331
@@ +330,3 @@
+  std::vector<const StringRef *> Redirects(3, nullptr);
+  auto InputFileStr = llvm::make_unique<const StringRef>(InputFile);
+  auto OutputFileStr = llvm::make_unique<const StringRef>(OutputFile);
----------------
dblaikie wrote:
> Could these just be straight StringRefs, rather than unique_ptr<StringRef>?
sys::ExecuteAndWait accepts (const StringRef ** as its arguments). That's sort of... ugly. It also distinguishes between nullptr (= redirect is not set up) and empty StringRef (= redirect to /dev/null).

================
Comment at: lib/Support/Unix/Signals.inc:348
@@ +347,3 @@
+  int RunResult =
+      sys::ExecuteAndWait(LLVMSymbolizerPath, args, nullptr, Redirects.data());
+  if (RunResult != 0)
----------------
dblaikie wrote:
> just an aside: but why is symbolication out-of-process in this way? (I mean I can think of the obvious "Because the process is already in a bad state, we don't want to do lots of work in that process for fear the bad state would somehow corrupt all that work" - but I'm not sure if that's the case & how much it applies when we're still dynamically allocating several structures here so we can call llvm-symbolizer and print the result, etc)
If we do an in-process symbolization, we'd need to link in both LLVMDebugInfo and LLVMObject into every LLVM tool/program that uses this signal handler (ew). I believe that Clang, for instance, doesn't (and shouldn't) depend on DWARF parser at the moment. This was the main reason.

I agree that using vectors/strings and a bunch of heavy methods from LLVMSupport in signal handler can potentially cause some problems, but, oh, not doing so would make the code so much worse. Still, I think that the amout of work we're doing here is significantly smaller than mapping large binaries and maintaining lots of internal structures for representing DWARF that's done in DebugInfo.

================
Comment at: lib/Support/Unix/Signals.inc:364
@@ +363,3 @@
+      // encounter empty line.
+      for (;;) {
+        StringRef FunctionName = *CurLine++;
----------------
dblaikie wrote:
> So this loop is where inlining is handled? (multiple results from llvm-symbolizer for a single query, terminated by a blank line?)
Yes.

================
Comment at: lib/Support/Unix/Signals.inc:369
@@ +368,3 @@
+        fprintf(FD, "#%d %p ", frame_no++, StackTrace[i]);
+        if (!FunctionName.startswith("??")) {
+          fprintf(FD, "%s ", FunctionName.str().c_str());
----------------
dblaikie wrote:
> usually skip the braces on single line blocks
Done

================
Comment at: lib/Support/Unix/Signals.inc:375
@@ +374,3 @@
+          fprintf(FD, "%s", FileLineInfo.str().c_str());
+        } else {
+          fprintf(FD, "(%s+%p)", modules[i], (void *)offsets[i]);
----------------
dblaikie wrote:
> More braces to skip (if and else)
Done

================
Comment at: lib/Support/Unix/Signals.inc:381
@@ +380,3 @@
+    } else {
+      fprintf(FD, "#%d %p\n", frame_no++, StackTrace[i]);
+    }
----------------
dblaikie wrote:
> This could be an early continue and reduce indentation
Done

http://reviews.llvm.org/D5610






More information about the llvm-commits mailing list