[Lldb-commits] [PATCH] D108233: WIP: Add minidump save-core functionality to ELF object files

Fri Aug 27 10:52:19 PDT 2021

clayborg requested changes to this revision.
clayborg added a comment.
This revision now requires changes to proceed.

No worries on only saving out the minimal for now, but just make sure that you error out if CoreStyle is anything but "eSaveCoreUnspecified" or "eSaveCoreStackOnly".

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:519
+lldb_private::Status
+MinidumpFileBuilder::AddMemoryList(const lldb::ProcessSP &process_sp) {
+  Status error;
----------------
Aj0SK wrote:
> Aj0SK wrote:
> > clayborg wrote:
> > > This should take in the CoreStyle and only emit what was asked for.
> > > 
> > > If style is set to "eSaveCoreStackOnly", then grab only the memory regions for each thread stack pointer and save only those. 
> > > 
> > > We can't tell from LLDB APIs if a page is dirty, so if we get "eSaveCoreDirtyOnly", then we will need to save all memory regions that have write permissions.
> > > 
> > > If it is either "eSaveCoreFull"  or "eSaveCoreUnspecified" then save everything.
> > I agree that this code should take care of a CoreStyle but it poses a significant problem to this implementation as it was mainly designed to save smaller Minidumps. This solution stores all of the information in memory before dumping them (this eases an implementation quite a bit because of the way how Minidump pointers (offsets) are working). This implementation, on my machine, exhausted memory before the minidump could be written in case of a full memory dump. At this time, we don't plan to reimplement this solution in the way it would allow to store all the data on disc at the time of minidump creation so there are two possible solutions that I see:
> > 
> > 1. If the type of the CoreStyle is full or dirty, this plugin will return false from the SaveCore function. Then maybe the default CoreStyle for this plugin should be changed to "eSaveCoreStackOnly".
> > 
> > 2. To the best of my knowledge, it should be theoretically possible to reimplement Dump() method to sort of take special care of a MemoryListStream, dumping also the memory information at the end of the minidump file as I think the memory dump is the most stressful for the memory and otherwise there is no problem with this.
> To state my preference, I would rather stick to the first option and landed a minimum viable product. The only reason for this is that I have this version way better tested and also I am not entirely sure about how minidump deals with the whole memory dumps as it can be a little problematic for a big memory chunks... Then, I would rather add the necessary changes in the next patch...
That is fine. I was just looking at the implementation below and I think I might know why this was taking up too much space. I will comment below, and maybe this will fix the issues of storing way too much info in the miniump

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:522
+  MemoryRegionInfo range_info;
+  error = process_sp->GetMemoryRegionInfo(0, range_info);
+
----------------
FYI: memory region for address zero often has no permissions, so there is no need to actually try and save this memory. For any memory region be sure the check if it has permissions using:

```
if (region_info.GetLLDBPermissions() != 0)
```
before trying to read or save this off. I will comment below

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:530
+  // Get interesting addresses
+  std::vector<size_t> interesting_addresses;
+  auto thread_list = process_sp->GetThreadList();
----------------
Should we have a std::set<lldb::addr_t> to store the base of any regions that have already been saved? Many PC values could be in the same regions (like when many threads are waiting in syscalls).

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:544-546
+  while (range_info.GetRange().GetRangeBase() != LLDB_INVALID_ADDRESS) {
+    const addr_t addr = range_info.GetRange().GetRangeBase();
+    const addr_t size = range_info.GetRange().GetByteSize();
----------------
Why are we checking the region_info for address zero here? The first time through this loop, it will contain information for address zero which will be a region that has no permissions. I suggest a new main loop in the comment below which simplifies this entire logic and also makes sure we don't emit the same region more than once.

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:549
+    bool is_interesting = false;
+    for (size_t interesting_address : interesting_addresses)
+      if (interesting_address >= addr && interesting_address < addr + size) {
----------------
Shouldn't this be the main loop that would replace the "while (range_info.GetRange().GetRangeBase() != LLDB_INVALID_ADDRESS)" loop? I would think that the main loop wiould be something like:

```
std::set<addr_t> visited_region_base_addresses;
for (size_t interesting_address : interesting_addresses) {
  error = process_sp->GetMemoryRegionInfo(interesting_address, range_info);
  // Skip failed memory region requests or any regions with no permissions.
  if (error.Fail() || range_info.GetPermissions() == 0)
    continue;
  const addr_t addr = range_info.GetRange().GetRangeBase();
  // Skip any regions we have already saved out.
  if (visited_region_base_addresses.insert(addr).second == false)
    continue;
  const addr_t size = range_info.GetRange().GetByteSize();
  if (size == 0)
    continue;
  auto data_up = std::make_unique<DataBufferHeap>(size, 0);
  const size_t bytes_read = process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
  if (bytes_read == 0)
    continue;
  // We have a good memory region with valid bytes to store.
  LocationDescriptor memory_dump;
  memory_dump.DataSize = static_cast<llvm::support::ulittle32_t>(size);
  memory_dump.RVA =static_cast<llvm::support::ulittle32_t>(GetCurrentDataEndOffset());
  MemoryDescriptor memory_desc;
  memory_desc.StartOfMemoryRange = static_cast<llvm::support::ulittle64_t>(addr);
  memory_desc.Memory = memory_dump;
  mem_descriptors.push_back(memory_desc);
  m_data.AppendData(data_up->GetBytes(), bytes_read);
}
```

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:551
+      if (interesting_address >= addr && interesting_address < addr + size) {
+        is_interesting = true;
+        break;
----------------
Make sure the memory region has any permissions. This returns a logic OR bitmask of lldb::Permissions bits:
```
enum Permissions {
  ePermissionsWritable = (1u << 0),
  ePermissionsReadable = (1u << 1),
  ePermissionsExecutable = (1u << 2)};
};
```
No permissions will return 0;

================
Comment at: lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp:568
+      auto data_up = std::make_unique<DataBufferHeap>(size, 0);
+      process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
+
----------------
You should listen to the return value from this ReadMemory call and only write this many bytes to the memory region. See my above proposed full loop source code abiove, as it reads the memory first before creating the LocationDescriptor and MemoryDescriptor and it skips to the next address in the loop if things go bad during memory read. It also stores the number if bytes that were actually read in case the process memory read comes back with fewer or no bytes.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108233/new/

https://reviews.llvm.org/D108233