[Lldb-commits] [PATCH] D98822: [lldb] [Process] Watch for fork/vfork notifications

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Tue Mar 30 05:15:05 PDT 2021


labath added a comment.

In D98822#2646052 <https://reviews.llvm.org/D98822#2646052>, @mgorny wrote:

> It includes refactoring clone/fork/vfork event monitoring into a single function. Right now, only forks are supported — the handler creates a local `NativeProcessLinux` instance (@labath, any suggestions how to make this less hacky?), duplicates parent's breakpoints into it (just like the child gets parent's memory) and uses it to clear these breakpoints and detach the child process.

I don't find this particularly hacky. However, I have doubts about the breakpoint clearing aspect. If we add that now, I think it will be tricky to move that process to the client in the future (how will the client know whether the breakpoints were auto-removed?) Given that  breakpoints in forked process have been borked since forever, I don't see harm in keeping that broken for a little while longer. Even a client with extremely limited of multiple processes (like the one we had in mind) should not have a problem with sending the appropriate `z` packets before it detaches. This patch could still come first (it's great that you've separated that out) -- it just needs to ensure that there are no breakpoints in the child path which would need clearing.

Btw, have you by any chance looked at how gdb determines which clone()s constitute a new thread?



================
Comment at: lldb/include/lldb/Host/common/NativeProcessProtocol.h:145
 
+  // Remove all software breakpoints and return a vector of breakpoint data
+  // that can be used to readd them.
----------------
mgorny wrote:
> mgorny wrote:
> > Long term, these functions are probably unnecessary. It just occurred to me to check how GDB handles it, and it handles clearing and restoring breakpoints on client side (i.e. by sending `z` and `Z` packets).
> Hmm, actually that depends on how much compatibility with old client versions we want to preserve. If we want forks/vforks to stop being broken with breakpoints when using an old client, we need to keep them as a fallback logic for when fork/vfork-events aren't supported by client.
I certainly don't think we need to try to "fix" older clients by doing stuff to the server. (And on top of that, I am less worried about people using older clients, than I am about older servers -- those can be harder to update.)


================
Comment at: lldb/include/lldb/Host/common/NativeProcessProtocol.h:401-404
+  struct Extension {
+    static constexpr uint32_t fork = 1;
+    static constexpr uint32_t vfork = 2;
+  };
----------------
mgorny wrote:
> labath wrote:
> > The llvm way to do this is via an enum class in combination with LLVM_MARK_AS_BITMASK_ENUM.
> Heh, and you've just told me to use `constexpr`s instead of `enum`s ;-).
That's cause you were using an enum to define constants. If you were using it to define a type, then I would be fine (though I might ask to make it a scoped enum).


================
Comment at: lldb/source/Host/linux/Host.cpp:323
+
+  if (!GetStatusInfo(tid, process_info, state, tracerpid, tgid) || tgid == 0)
+    return llvm::None;
----------------
Why is this checking for zero ? (When it's being set to INVALID_PID above?)


================
Comment at: lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp:924
+    auto pgid_ret = getPIDForTID(child_pid);
+    if (!pgid_ret || pgid_ret.getValue() != child_pid) {
+      // A new thread should have PGID matching our process' PID.
----------------
`if (pgid_ret != child_pid)` ?


================
Comment at: lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp:456
+      LLDB_LOG(log, "tid {0} belongs to a different tgid {1}, assuming child",
+               pid, tgid.getValue());
+      MonitorFork(pid, false, 0);
----------------
mgorny wrote:
> labath wrote:
> > mgorny wrote:
> > > For some reason, this doesn't print the correct tgid value (plain printf works). Any clue what's wrong here?
> > what does it print?
> `tid 118029 belongs to a different tgid 4295085325, assuming child`
> 
> When tgid was 0, it also printed this huge number. If I cast it to `int`, it prints fine. But then, first arg is also `lldb::pid_t` and it prints fine.
I dunno. I recommend debugging it. It smells like some kind of undefined behavior.


================
Comment at: lldb/test/Shell/Subprocess/fork-follow-parent.test:6
+process launch
+# CHECK: function run in child
+# CHECK-NOT: function run in parent
----------------
mgorny wrote:
> The tests sometimes fail if child starts after the breakpoint is hit (and therefore this check happens before stop reason below). Any suggestion how to make it work independently of where 'function run in child' happens?
There are several ways around that, but I don't know which one is the best, as I'm not sure what exactly you're trying to test. You could try to capture this nondeterminism with CHECK-DAGs, but CHECK-DAGs don't combine well with CHECK-NOTs. Another option is to make this predictable by adding synchronization to the inferior. If you just wanted to check that the child process finishes correctly, you could have the parent process check its exit status and match that...


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98822/new/

https://reviews.llvm.org/D98822



More information about the lldb-commits mailing list