[Lldb-commits] [PATCH] D66249: [JIT][Breakpoint] Add "BreakpointInjectedSite" and FCB Trampoline

Mon Aug 26 02:41:58 PDT 2019

labath added a subscriber: xiaobai.
labath added a comment.

In D66249#1643594 <https://reviews.llvm.org/D66249#1643594>, @mib wrote:

> In D66249#1642316 <https://reviews.llvm.org/D66249#1642316>, @labath wrote:
>
> > A bunch more comments from me. So far, I've only tried to highlight the most obvious problems or style issues.
> >
> >   It's not clear to me whether this is still prototype code or not... If it isn't, I'll have a bunch more.. :)
>
>
> This is prototype code that I'm trying to upstream to build upon.
>
> From the comments I already gathered, I started thinking of a better implementation of this feature (maybe v2).
>  There are still some parts that need do be done, such as resolving the variables on optimized code or patching dynamically the copied instructions.
>
> But, I'd like to have a baseline that is upstream to start working on these parts.
>
> **All feedback is welcome !** :)

Ok, I see where you're going. I think we have a different interpretation of the word "prototype" . When I say "prototype code", I mean: the changes that I've made while experimenting, to try out whether a particular approach is feasible, and which I'm sharing with other people to get early feedback about the direction I am going in. These changes may include various shortcuts or hacks, which I've made to speed up the development of this prototype, but I understand that these will have to be removed before the final code is submitted (in fact, the final patch may often be a near-complete reimplementation of the prototype) . In this light, the phrase "prototype which I am trying to upstream" is a bit of an oxymoron since by the time it is upstreamed, the code should no longer be a prototype. It does not mean that the code should support every possible use case right from the start, but it should follow best coding, design and testing practices, regardless of any "experimental", "prototype" or other label.

With the eventual upstreaming goal in mind, I've tried to make a more thorough pass over this patch. The comments range from minute style issues like using `(void)` in function declarations, which can be trivially fixed, to layering concerns that may require more serious engineering work. Overall, my biggest concern is that there is no proper encapsulation in this patch. Despite being advertised as "fully extensible" this patch bakes in a lot of knowledge in very generic pieces of code. For instance, the `BreakpointInjectedSite` class is riddled with OS (`mach_vm_allocate`), architecture (`rbp`) and language (`Builtin.int_trap()`) specific knowledge -- it shouldn't need to know about none of those. I think this is the biggest issue that needs to be resolved before this patchset is ready for upstreaming. I've also made a number of other suggestions (here and in other threads) about possible alternative courses implementations. These are somewhat subjective and so I am not expecting you to implement all of them. However, I would:
a) expect some sort of a reply saying why you chose to do one thing or the other
b) strongly encourage you to try them out because I believe they will make fixing some of the other issues easier (for instance, switching to allocating the memory on the stack would mean that we can completely avoid the "how to allocate memory on different OSs" discussion, and just deal with architecture specifics, since subtracting from the stack pointer works the same way on every OS).

Also, this talk of a `v2` makes me worried that this code will become deprecated/unmaintained the moment it hits the repository. Lldb has some bad experience with fire-and-forget features, and the prospect of this makes me want to be even stricter about any bad patterns I notice in the code. IIUC, you are now working on a different take on this feature, which kind of means that you yourself are not sure about whether this is the best approach here. I understand the need for comparison, but why does one of these things need to be in the main repo for you to be able to compare them?

================
Comment at: lldb/include/lldb/Breakpoint/BreakpointInjectedSite.h:175
+  ///     The source code needed to copy the variable in the argument structure.
+  std::string ParseDWARFExpression(size_t index, Status &error);
+
----------------
Return `Expected<string>` instead of a string+error combo.

================
Comment at: lldb/include/lldb/Breakpoint/BreakpointSite.h:54-56
+   static bool classof(const BreakpointSite *bp_site) {
+     return bp_site->getKind() == eKindBreakpointSite;
+   }
----------------
mib wrote:
> labath wrote:
> > mib wrote:
> > > labath wrote:
> > > > This is weird. So, in OO terminology, `BreakpointInjectedSite` "is-a" `BreakpointSite`, but if you ask `llvm::isa<BreakpointSite>(injected_site)`, it will return false?
> > > No, it returns **true**.
> > Then does `llvm::isa<BreakpointInjectedSite>(injected_site)` return false? They can't both return true, given the current implementations (but they should, if the dynamic type of `injected_site` really is `BreakpointInjectedSite`...
> `llvm::isa<BreakpointSite>(injected_site) ` & `llvm::isa<BreakpointInjectedSite>(injected_site)` both return **true**.
I'm sorry, I was wrong. It is indeed possible for both of these expressions to return true. However, I was also right in saying this implementation is incorrect :), because the reason that `llvm::isa<BreakpointSite>(injected_site)` does work is because this function does not get invoked at all -- that expression hits the upcast special case/optimization in the `isa` implementation <https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Support/Casting.h#L62>, which does not invoke the `classof` method when the cast can be statically proven "correct".

So, this function is dead code and should be removed.

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:43
+
+  for (BreakpointLocationSP loc_sp : m_owners.BreakpointLocations()) {
+
----------------
It looks like this entire loop is wrong for the case when you have more than one breakpoint location -- you'll print `static int hit_count` twice, the second time inside the `if` statement. And you'll just accumulate weird stuff inside the `trap` variable. 

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:60-80
+    if (language == eLanguageTypeSwift) {
+      trap += "Builtin.int_trap()";
+    } else if (Language::LanguageIsCFamily(language)) {
+      trap = "__builtin_debugtrap()";
+    } else {
+      LLDB_LOG(log, "FCB: Language {} not supported",
+               Language::GetNameForLanguageType(language));
----------------
This encodes a lot of language specifics, which probably don't belong in a generic class like this. I guess this should somehow go through the ExpressionParser plugin, but I am super familiar with how those work. @xiaobai, do you have any thoughts on that?

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:72
+    condition_text += "static int hit_count = 0;\n\thit_count++;\n\t";
+    condition_text += (single_condition) ? "if (" : " || ";
+    condition_text += condition;
----------------
I was confused when reading this because I thought this refers to the overall count of the conditions, not the fact whether you're processing the first condition or not. I'd rename this to `first_condition` to make it clearer that its value can change between loop iterations (`single_condition` sounds too immutable -- either there is just one condition, or there are many of them).

Another pattern you can consider is having a single `condition_prefix` variable, and printing the conditions via something like:
```
prefix = "if (";
for (c: conditions) {
  stream << prefix << c.getText();
  prefix = " || ";
}
```
I generally find this pattern cleaner, but that is somewhat a matter of personal taste.

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:141
+
+  void *buffer = std::calloc(jit_addr_range.GetByteSize(), sizeof(uint8_t));
+
----------------
Is this memory freed anywhere? You should really use a safer memory management construct here. It could be as simple as `vector<uint8_t>`

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:157-180
+  if (!platform_sp) {
+    error.SetErrorString("Couldn't get running platform");
+    return false;
+  }
+
+  if (!platform_sp->GetSoftwareBreakpointTrapOpcode(*m_target_sp.get(), this)) {
+    error.SetErrorString("Couldn't get current architecture trap opcode");
----------------
The error handling here seems to be very inconsistent. Sometimes you just set the error string (but don't do anything with it), another time you log something.

Overall, I'd recommend preferring returning actual (llvm::)errors instead of a bool+log combo because it's shorter, and enables one to decide what to do with the errors higher up (e.g. it may be possible to display the error to the user in the future).

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:246
+  ClangUserExpression *clang_expr =
+      llvm::dyn_cast<ClangUserExpression>(m_condition_expression_sp.get());
+
----------------
This also looks like an improper core code -> plugin dependency.

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:274
+    if (!value) {
+      LLDB_LOG(log, "FCB: Couldn't find value for element {}/{}", i,
+               num_elements);
----------------
llvm format strings require position specifiers, so this needs to be `{0}/{1}`

================
Comment at: lldb/source/Breakpoint/BreakpointInjectedSite.cpp:430-432
+    case DW_OP_fbreg: {
+      int64_t operand = op.getRawOperand(0);
+      expr += "   src_addr = (void*) (regs->rbp + " + std::to_string(operand) +
----------------
Note that `DW_OP_fbreg` does not mean that that the location is relative to the rbp (frame pointer register). It is relative to whatever the debug info says it's the base of this functions frame via DW_AT_frame_base. So, that could be `rsp+constant`, or anything else really... I think it's fine to not support any other frame base than a plain `rbp`, but it would probably be good to detect that and abort if the frame is not rbp-based.

================
Comment at: lldb/source/Breakpoint/BreakpointOptions.cpp:589
     if (level != eDescriptionLevelBrief) {
+      std::string fast = (m_inject_condition) ? " (FAST)" : "";
       s->EOL();
----------------
This is pretty minute, but I don't remember seeing this style of writing the ternary operator anywhere else in lldb or llvm codebase (i.e., wrapping the condition inside `()` even when it is a single token). It would be better to avoid that for consistency.

================
Comment at: lldb/source/Plugins/ABI/SysV-x86_64/ABISysV_x86_64.cpp:234-328
+  size_t context_size = X64_SAVED_REGS + X64_VOLATILE_REGS;
+
+  size_t expected_trampoline_size = context_size; // Save registers
+  expected_trampoline_size += X64_MOV_SIZE;       // Pass register addr as arg
+  expected_trampoline_size += X64_CALL_SIZE;      // Create arg struct
+  expected_trampoline_size += X64_MOV_SIZE;       // Pass arg struct to jit expr
+  expected_trampoline_size += X64_CALL_SIZE;      // Call jit expr
----------------
This code would be much simpler if you just used some kind of a stream to build up the assembly. You'd avoid needing to compute the size of the buffer in advance, allocating a variable-length array (non-standard extension), and having a sanity double check at the end.

================
Comment at: lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.h:307

+  ExpressionVariableList &GetStructMembers(void) { return m_struct_members; }
+
----------------
no `(void)`

================
Comment at: lldb/source/Target/Process.cpp:1701
+        if (!bp_injected_site_sp->BuildConditionExpression()) {
+          error = "FCB: Couldn't build the condition expression";
+          return FallbackToRegularBreakpointSite(owner, use_hardware, log,
----------------
Why stuff the string inside a `std::string`, if you're just going to call `.c_str()` on it anyway?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66249/new/

https://reviews.llvm.org/D66249