[Lldb-commits] [PATCH] D66791: [lldb][ELF] Read symbols from .gnu_debugdata sect.

Fri Aug 30 01:33:31 PDT 2019

labath added a comment.

In D66791#1651300 <https://reviews.llvm.org/D66791#1651300>, @kwk wrote:

> Currently the test `lldb/packages/Python/lldbsuite/test/linux/minidebuginfo/TestMiniDebugInfo.py` fails if `LLVM_ENABLE_LZMA` is not defined because the test stupidly assumes that support for minidebuginfo is available.

For this you'll need to make sure that the information about whether lzma is found or not is piped into the test suite. Take a look at `lldb/lit/lit.cfg.py` to see how it's done for python for instance. Then you can have a `REQUIRES: lzma` test which checks that the section is decompressed and a `REQUIRES: !lzma` test which ensures the warning gets printed.

I'd also like to see the yaml test resurrected. I find that one to be pretty important as it has nearly to system dependencies, and so it can run anywhere. If you don't want to tackle the yaml2obj thing now, you can always remove the extra symtab section via objcopy.

================
Comment at: lldb/include/lldb/Host/LZMA.h:24-30
+llvm::Error getUncompressedSize(const uint8_t *compressedBuffer,
+                                uint64_t compressedBufferSize,
+                                uint64_t &uncompressedSize);
+
+llvm::Error uncompress(const uint8_t *compressedBuffer,
+                       uint64_t compressedBufferSize, uint8_t *uncompressedData,
+                       uint64_t uncompressedSize);
----------------
How about an API like `llvm::Error uncompress(ArrayRef<uint8_t> InputBuffer, SmallVectorImpl<uint8_t> &Uncompressed)`, where the `uncompress` function automatically figures out the appropriate size of the buffer, and resizes it accordingly?

 As it stands now, this interface is neither easy to use, nor consistent with the llvm API. The above signature will make your use case easy, while still (slightly) improving consistency with the llvm API. If it makes anything easier, feel free to replace `uint8_t` with `char`, and `ArrayRef` with `StringRef` -- that would improve consistency, but I thing that the usage of StringRef/char in that API is a mistake that just keeps propagating itself.

================
Comment at: lldb/packages/Python/lldbsuite/test/linux/minidebuginfo/Makefile:1-55
+LEVEL = ../../make
+C_SOURCES := main.c
+
+all: clean binary
+
+.PHONY: binary
+binary: a.out
----------------
All of this would be much easier if you wrote it as a lit test.
```
// RUN: %clangxx %s -o %t
// RUN: llvm-nm ...
// ...
// RUN: %lldb %t -o "br set -n multiplyByThree" -o run -o exit | FileCheck %s

// CHECK: Breakpoint 1 hit
```

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:1849-1851
+  if (m_gnuDebugDataObjectFile != nullptr) {
+    return m_gnuDebugDataObjectFile;
+  }
----------------
Not a big deal, but we usually don't put braces around short blocks like this. If the block spans multiple lines, then the braces are fine, even if the block technically consists of a single statement only.

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:1866
+  // Determine size of uncompressed data
+  auto data = DataExtractor();
+  section->GetSectionData(data);
----------------
`DataExtractor data`. llvm does not "always use auto".

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:1868-1881
+  uint64_t uncompressedSize = 0;
+  auto err = lldb_private::lzma::getUncompressedSize(data.GetDataStart(), data.GetByteSize(), uncompressedSize);
+  if (err) {
+    GetModule()->ReportWarning(
+        "Failed to get size of uncompressed LZMA buffer from section %s: %s",
+        section->GetName().AsCString(), Status(std::move(err)).AsCString());
+    return nullptr;
----------------
With the new API, this would be something like
```
SmallVector<uint8_t> uncompressed;
if (llvm::Error error = lzma::uncompress(data.GetData(), uncompressed) {
  GetModule()->ReportWarning(""An error occured while decompression the section %s: %s", section->GetName().AsCString(), llvm::toString(std::move(error)).c_str());
  return nullptr;
}

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:1901
+  ArchSpec spec = m_gnuDebugDataObjectFile->GetArchitecture();
+  if (spec && m_gnuDebugDataObjectFile->SetModulesArchitecture(spec)) {
+    return m_gnuDebugDataObjectFile;
----------------
Are you sure you want to do that? Presumably, the architecture of the module was already set by the containing object file, and that file probably has a more correct understanding of what the right architecture is...

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2711-2731
     Section *symtab =
         section_list->FindSectionByType(eSectionTypeELFSymbolTable, true).get();
-    if (!symtab) {
-      // The symtab section is non-allocable and can be stripped, so if it
-      // doesn't exist then use the dynsym section which should always be
-      // there.
-      symtab =
-          section_list->FindSectionByType(eSectionTypeELFDynamicSymbols, true)
-              .get();
-    }
     if (symtab) {
       m_symtab_up.reset(new Symtab(symtab->GetObjectFile()));
       symbol_id += ParseSymbolTable(m_symtab_up.get(), symbol_id, symtab);
     }

----------------
Let's put this bit into a separate patch. As I said over IRC, I view this bit as functionally independent of the debugdata stuff (it's definitely independent in it's current form, as it will kick in for non-debugdata files too, which I think is fine).

================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.h:395
+
+  std::shared_ptr<ObjectFileELF> m_gnuDebugDataObjectFile;
 };
----------------
move this next to other instance variables.

================
Comment at: lldb/source/Plugins/SymbolVendor/ELF/SymbolVendorELF.cpp:79
+  // If there's none in the orignal object file, we add it to it.
+  if (auto gdd_obj_file =
+          obj_file->GetGnuDebugDataObjectFile()) {
----------------
kwk wrote:
> labath wrote:
> > Wouldn't it be better to first check for the external file, and then fall back to gnu_debugdata, as the external file will likely have more complete debug info?
> My idea was to load whatever symbols we can get and let it be overwritten by the the more concrete ones that might come later. Changing the logic requires to your suggesttion would require a bit more effort in that I cannot simply leave the `return nullptr` expressions untouched. 
Aha, interesting. I missed the fact that you are continuing the search after extracting the debugdata section. While I don't expect that will be too useful, it does sound like a good idea in principle.

The thing that worries me in this case is precisely the `return nullptr`. That value is supposed to mean that the symbol vendor hasn't done anything, and lldb should continue the search. But in this case, you actually *have* done something. If that's how you want to implement this, then I think it would be better if this logic was fully inside ObjectFileELF. I think you should be able to achieve that by just moving this bit of code into ObjectFileELF::CreateSections. That way, the contents of gnu_debugdata would be considered an indivisible part of the main object file (which is kind of true), and anything that the symbol vendor finds in the external files (which is his main goal) would come on top of that.

================
Comment at: llvm/CMakeLists.txt:53-361
+include(CMakeDependentOption)
+
 if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
   message(STATUS "No build type selected, default to Debug")
   set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "Build type (default Debug)" FORCE)
 endif()

----------------
Move this stuff into lldb's cmake file (lldb/cmake/modules/LLDBConfig.cmake, probably).

================
Comment at: llvm/include/llvm/Support/MathExtras.h:271
   unsigned char out[sizeof(Val)];
-  std::memcpy(in, &Val, sizeof(Val));
+  memcpy(in, &Val, sizeof(Val));
   for (unsigned i = 0; i < sizeof(Val); ++i)
----------------
kwk wrote:
> labath wrote:
> > Huh?
> I did get this compile error before:
> 
> 
> ```
> compile error: no memcpy in std ns
> 
> /home/kkleine/llvm/llvm/include/llvm/Support/MathExtras.h:271:8: error: no member named 'memcpy' in namespace 'std'; did you mean 'wmemcpy'?
> 
> std::memcpy(in, &Val, sizeof(Val));
> ~~~~~^~~~~~
>      wmemcpy
> ```
> 
> So I "fixed" it along the way. And according to http://www.cplusplus.com/reference/cstring/memcpy/?kw=memcpy, there's no `std::memcpy`
There is according to `https://en.cppreference.com/w/cpp/string/byte/memcpy`. (I generally find cppreference.com to be a more reliable source of information cplusplus.com). I think this needs more investigation and a separate patch, as right now you seem to be the only person that has a problem with this line of code. I think it's more likely there some misconfiguration issue on your side than an llvm bug.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66791/new/

https://reviews.llvm.org/D66791