[Lldb-commits] [PATCH] D145624: [lldb] Make MemoryCache::Read more resilient

Thu Mar 9 11:25:54 PST 2023

bulbazord added inline comments.

================
Comment at: lldb/source/Target/Memory.cpp:133-141
   // Check the L1 cache for a range that contain the entire memory read. If we
-  // find a range in the L1 cache that does, we use it. Else we fall back to
-  // reading memory in m_L2_cache_line_byte_size byte sized chunks. The L1
-  // cache contains chunks of memory that are not required to be
+  // find a range in the L1 cache that does, we use it. If not:
+  //  - dst_len > L2 cache line size: Attempt to read the entire memory chunk
+  //    and insert whatever we get back into the L1 cache.
+  //  - dst_len <= L2 cache line size: Attempt to read it from the L2 cache, and
+  //    if its not there, attempt to populate the cache.
+  // The L1 cache contains chunks of memory that are not required to be
----------------
JDevlieghere wrote:
> Instead of describing the algorithm here, would it make sense to break this up and put it above the relevant code below? It seems like it matches pretty well with the code structure. Looking at the signature of `FindEntryThatContains` and the fact that it doesn't take the size, I assume it's because we only check the start address? 
Good point!

================
Comment at: lldb/source/Target/Memory.cpp:149
+  if (m_invalid_ranges.FindEntryThatContains(addr)) {
+    error.SetErrorStringWithFormat("memory read failed for 0x%" PRIx64, addr);
+    return 0;
----------------
JDevlieghere wrote:
> Should the error mention that if failed due to the address overlapping with an invalid range? Is an invalid range something that is meaningful to the user? 
I don't think the concept of "invalid range" is meaningful to the user right now. I'm pretty sure we only use it to prevent us from reading __PAGEZERO on apple platforms.

================
Comment at: lldb/source/Target/Memory.cpp:181-183
+    // FIXME: We should be able to wrap this into the check near the beginning
+    // of this function but we don't check for overlapping ranges so we will
+    // additionally check each cache line we read.
----------------
clayborg wrote:
> Not on the FIXME: We can't really check this near the beginning, because this happens for each cache line we as we advance the "curr_cache_line_base_addr" right? 
> 
> One thing to note about this code is that we might need to read at most 2 cache lines for any requests that make it to this code since we check above for "if (dst_len > m_L2_cache_line_byte_size)..." and use the L1 cache if that is true. So we know that we will read at most 2 cache lines depending on the offset. Might be nice to read the 2 cache lines in one memory access below if possible, and then make two cache entries with the result, but it will be either one cache line read, or two
If I'm understanding what you mean correctly, I think we can check this near the beginning. We have all the information we need to do safety checks before we even start reading anything, I believe...?

Also, because we read at most 2 cache lines, I can probably get rid of the loop and just do 2 sequential reads...

================
Comment at: lldb/source/Target/Memory.cpp:196-198
+      // If we didn't fill out the cache line completely and the offset is
+      // bigger than what we have available, we can't do anything further
+      // here.
----------------
clayborg wrote:
> JDevlieghere wrote:
> > Why can't we read from the process? Same question below.
> Because if we did a memory request before from a valid "curr_cache_line_base_addr", and we got back fewer bytes that requested, then the bytes won't be available later right?
If we've hit this code path then we have previously read from the process and got back fewer bytes than a cache line fits. For example, maybe a cache line is 512 bytes and when we performed the read we got back 502 bytes for some reason. If we're trying to read the last 10 bytes of that line, that's just not available, so we bail out. We could try to protect against this in a number of ways, like if we get back fewer bytes than we wanted initially then maybe we can retry the read before caching the line, or if the line isn't filled out maybe can try to read the inferior one more time or something.

Ultimately, I want `MemoryCache` to be prepared for reads to be incomplete and guard against touching memory that we don't have.

================
Comment at: lldb/source/Target/Memory.cpp:200
+      if (cache_offset >= cached_line_size)
+        return dst_len - bytes_left;

----------------
mib wrote:
> Shouldn't this be the size of the cached line ?
To be honest, I'm somewhat sure that this line actually could be `return 0;`. I'm pretty sure that we only will hit this on the first cache line read. If we read a second cache line, we should always be starting at the beginning of the cache line... I'll probably refactor this.

================
Comment at: lldb/source/Target/Memory.cpp:210
+      bytes_left -= curr_read_size;
+      curr_cache_line_base_addr += cache_line_byte_size;
+      cache_offset = 0;
----------------
mib wrote:
> IIUC, now that the read succeeded, you're shifting the current cache line base address to point to the next cache line base address, so you can continue reading if necessary. I had troubles understanding the point of this line before reading the rest of the code so either this should move closer to where it's used or at least it should have a comment explaining what it's doing.
I should have paid closer attention to this line. We should only be moving the cache line base address to the next cache line if we're going to continue reading. Will refactor.

================
Comment at: lldb/source/Target/Memory.cpp:245
+      assert((curr_cache_line_base_addr % cache_line_byte_size) == 0);
+      std::unique_ptr<DataBufferHeap> data_buffer_heap_up(
+          new DataBufferHeap(cache_line_byte_size, 0));
----------------
clayborg wrote:
> Should we just create a DataBufferSP right away here instead of creating a unique pointer and releasing it later?
That sounds like a smarter move.

================
Comment at: lldb/source/Target/Memory.cpp:255-256

-      // We need to read from the process
+      if (process_bytes_read < cache_line_byte_size)
+        data_buffer_heap_up->SetByteSize(process_bytes_read);

----------------
clayborg wrote:
> If we don't read an entire cache line, should we populate this into the L1 cache instead? It might make the logic for accessing data in the L2 cache a bit simpler?
That might not be a bad idea actually. I'll try it and see how much it simplifies the logic.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145624/new/

https://reviews.llvm.org/D145624