[Lldb-commits] [lldb] 74a5175 - [lldb] Make order of completions for expressions deterministic and sorted by Clang's priority values.

Wed May 27 10:23:03 PDT 2020

Author: Raphael Isemann
Date: 2020-05-27T19:22:01+02:00
New Revision: 74a51753a6c2c587f650174e19f99279e8e4ef35

URL: https://github.com/llvm/llvm-project/commit/74a51753a6c2c587f650174e19f99279e8e4ef35
DIFF: https://github.com/llvm/llvm-project/commit/74a51753a6c2c587f650174e19f99279e8e4ef35.diff

LOG: [lldb] Make order of completions for expressions deterministic and sorted by Clang's priority values.

Summary:

It turns out that the order in which we provide completions for expressions is
nondeterministic. This leads to confusing user experience and also breaks the
reproducer tests (as two LLDB tests can go out of sync due to the
non-determinism in the completion lists)

The reason for the non-determinism is that the CompletionConsumer informs us
about decls in the order in which it finds declarations in the lookup store of
the DeclContexts it visits (mainly this snippet in SemaLookup.cpp):

``` lang=c++
    // Enumerate all of the results in this context.
    for (DeclContextLookupResult R :
         Load ? Ctx->lookups()
              : Ctx->noload_lookups(/*PreserveInternalState=*/false)) {
       [...]
```

This storage of the lookup is sorted by pointer values (see the hash of
`DeclarationName`) and can therefore be non-deterministic. The LLDB code
completion consumer that receives these calls originally expected that the order
of declarations is defined by Clang, but it seems the API expects the client to
provide an order to the completions.

This patch fixes the issue as follows:

* We sort the completions we get from Clang alphabetically and also by the
priority value we get from Clang (with priority value sorting having precedence
over the alphabetical sorting)

* We make all the functions/variables that touch a completion before the sorting
const-qualified. The idea is that this should prevent that we never have
observable side-effect from touching these declarations in a non-deterministic
order (e.g., we don't try to complete the type by accident).

This way we behave like the other parts of Clang which also sort the results by
some deterministic value (usually the name or something computed from a name,
e.g., edit distance to a given string).

We most likely also need to fix the Clang code to make the loop I listed above
deterministic to prevent these issues in the future (tracked in rdar://63442513
). This wouldn't replace the functionality provided in this patch though as we
would still need the priority and overall alphabetical sorting.

Note: I had to increase the lldb-vscode completion limit to 100 as the tests
look for strings that aren't in the first 50 results anymore due to variable
names starting with letters like 'v' (which are now always shown much further
down in the list due to the alphabetical sorting).

Fixes rdar://63200995

Reviewers: JDevlieghere, clayborg

Reviewed By: JDevlieghere

Subscribers: mgrang, abidh

Differential Revision: https://reviews.llvm.org/D80292

Added: 
    

Modified: 
    lldb/packages/Python/lldbsuite/test/lldbtest.py
    lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
    lldb/test/API/commands/expression/completion/TestExprCompletion.py
    lldb/tools/lldb-vscode/lldb-vscode.cpp

Removed: 
    


################################################################################
diff  --git a/lldb/packages/Python/lldbsuite/test/lldbtest.py b/lldb/packages/Python/lldbsuite/test/lldbtest.py
index 639f99463d92..0dee4f217c80 100644

--- a/lldb/packages/Python/lldbsuite/test/lldbtest.py
+++ b/lldb/packages/Python/lldbsuite/test/lldbtest.py
@@ -2147,13 +2147,27 @@ def match(
 
         return match_object
 
-    def check_completion_with_desc(self, str_input, match_desc_pairs):
+    def check_completion_with_desc(self, str_input, match_desc_pairs, enforce_order=False):
+        """
+        Checks that when the given input is completed at the given list of
+        completions and descriptions is returned.
+        :param str_input: The input that should be completed. The completion happens at the end of the string.
+        :param match_desc_pairs: A list of pairs that indicate what completions have to be in the list of
+                                 completions returned by LLDB. The first element of the pair is the completion
+                                 string that LLDB should generate and the second element the description.
+        :param enforce_order: True iff the order in which the completions are returned by LLDB
+                              should match the order of the match_desc_pairs pairs.
+        """
         interp = self.dbg.GetCommandInterpreter()
         match_strings = lldb.SBStringList()
         description_strings = lldb.SBStringList()
         num_matches = interp.HandleCompletionWithDescriptions(str_input, len(str_input), 0, -1, match_strings, description_strings)
         self.assertEqual(len(description_strings), len(match_strings))
 
+        # The index of the last matched description in description_strings or
+        # -1 if no description has been matched yet.
+        last_found_index = -1
+        out_of_order_errors = ""
         missing_pairs = []
         for pair in match_desc_pairs:
             found_pair = False
@@ -2162,20 +2176,35 @@ def check_completion_with_desc(self, str_input, match_desc_pairs):
                 description_candidate = description_strings.GetStringAtIndex(i)
                 if match_candidate == pair[0] and description_candidate == pair[1]:
                     found_pair = True
+                    if enforce_order and last_found_index > i:
+                        new_err = ("Found completion " + pair[0] + " at index " +
+                                  str(i) + " in returned completion list but " +
+                                  "should have been after completion " +
+                                  match_strings.GetStringAtIndex(last_found_index) +
+                                  " (index:" + str(last_found_index) + ")\n")
+                        out_of_order_errors += new_err
+                    last_found_index = i
                     break
             if not found_pair:
                 missing_pairs.append(pair)
 
+        error_msg = ""
+        got_failure = False
         if len(missing_pairs):
-            error_msg = "Missing pairs:\n"
+            got_failure = True
+            error_msg += "Missing pairs:\n"
             for pair in missing_pairs:
                 error_msg += " [" + pair[0] + ":" + pair[1] + "]\n"
+        if len(out_of_order_errors):
+            got_failure = True
+            error_msg += out_of_order_errors
+        if got_failure:
             error_msg += "Got the following " + str(num_matches) + " completions back:\n"
             for i in range(num_matches + 1):
                 match_candidate = match_strings.GetStringAtIndex(i)
                 description_candidate = description_strings.GetStringAtIndex(i)
-                error_msg += "[" + match_candidate + ":" + description_candidate + "]\n"
-            self.assertEqual(0, len(missing_pairs), error_msg)
+                error_msg += "[" + match_candidate + ":" + description_candidate + "] index " + str(i) + "\n"
+            self.assertFalse(got_failure, error_msg)
 
     def complete_exactly(self, str_input, patterns):
         self.complete_from_to(str_input, patterns, True)

diff  --git a/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp b/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
index 8885cbc85b2c..14dd0656bf82 100644
--- a/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
+++ b/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
@@ -666,11 +666,33 @@ class CodeComplete : public CodeCompleteConsumer {
 
   std::string m_expr;
   unsigned m_position = 0;
-  CompletionRequest &m_request;
   /// The printing policy we use when printing declarations for our completion
   /// descriptions.
   clang::PrintingPolicy m_desc_policy;
 
+  struct CompletionWithPriority {
+    CompletionResult::Completion completion;
+    /// See CodeCompletionResult::Priority;
+    unsigned Priority;
+
+    /// Establishes a deterministic order in a list of CompletionWithPriority.
+    /// The order returned here is the order in which the completions are
+    /// displayed to the user.
+    bool operator<(const CompletionWithPriority &o) const {
+      // High priority results should come first.
+      if (Priority != o.Priority)
+        return Priority > o.Priority;
+
+      // Identical priority, so just make sure it's a deterministic order.
+      return completion.GetUniqueKey() < o.completion.GetUniqueKey();
+    }
+  };
+
+  /// The stored completions.
+  /// Warning: These are in a non-deterministic order until they are sorted
+  /// and returned back to the caller.
+  std::vector<CompletionWithPriority> m_completions;
+
   /// Returns true if the given character can be used in an identifier.
   /// This also returns true for numbers because for completion we usually
   /// just iterate backwards over iterators.
@@ -687,7 +709,7 @@ class CodeComplete : public CodeCompleteConsumer {
   /// Drops all tokens in front of the expression that are unrelated for
   /// the completion of the cmd line. 'unrelated' means here that the token
   /// is not interested for the lldb completion API result.
-  StringRef dropUnrelatedFrontTokens(StringRef cmd) {
+  StringRef dropUnrelatedFrontTokens(StringRef cmd) const {
     if (cmd.empty())
       return cmd;
 
@@ -708,7 +730,7 @@ class CodeComplete : public CodeCompleteConsumer {
   }
 
   /// Removes the last identifier token from the given cmd line.
-  StringRef removeLastToken(StringRef cmd) {
+  StringRef removeLastToken(StringRef cmd) const {
     while (!cmd.empty() && IsIdChar(cmd.back())) {
       cmd = cmd.drop_back();
     }
@@ -719,7 +741,7 @@ class CodeComplete : public CodeCompleteConsumer {
   /// existing command. Returns the completion string that can be returned to
   /// the lldb completion API.
   std::string mergeCompletion(StringRef existing, unsigned pos,
-                              StringRef completion) {
+                              StringRef completion) const {
     StringRef existing_command = existing.substr(0, pos);
     // We rewrite the last token with the completion, so let's drop that
     // token from the command.
@@ -741,11 +763,10 @@ class CodeComplete : public CodeCompleteConsumer {
   /// \param[out] position
   ///    The character position of the user cursor in the `expr` parameter.
   ///
-  CodeComplete(CompletionRequest &request, clang::LangOptions ops,
-               std::string expr, unsigned position)
+  CodeComplete(clang::LangOptions ops, std::string expr, unsigned position)
       : CodeCompleteConsumer(CodeCompleteOptions()),
         m_info(std::make_shared<GlobalCodeCompletionAllocator>()), m_expr(expr),
-        m_position(position), m_request(request), m_desc_policy(ops) {
+        m_position(position), m_desc_policy(ops) {
 
     // Ensure that the printing policy is producing a description that is as
     // short as possible.
@@ -758,9 +779,6 @@ class CodeComplete : public CodeCompleteConsumer {
     m_desc_policy.Bool = true;
   }
 
-  /// Deregisters and destroys this code-completion consumer.
-  ~CodeComplete() override {}
-
   /// \name Code-completion filtering
   /// Check if the result should be filtered out.
   bool isResultFilteredOut(StringRef Filter,
@@ -788,6 +806,85 @@ class CodeComplete : public CodeCompleteConsumer {
     return true;
   }
 
+private:
+  /// Generate the completion strings for the given CodeCompletionResult.
+  /// Note that this function has to process results that could come in
+  /// non-deterministic order, so this function should have no side effects.
+  /// To make this easier to enforce, this function and all its parameters
+  /// should always be const-qualified.
+  /// \return Returns llvm::None if no completion should be provided for the
+  ///         given CodeCompletionResult.
+  llvm::Optional<CompletionWithPriority>
+  getCompletionForResult(const CodeCompletionResult &R) const {
+    std::string ToInsert;
+    std::string Description;
+    // Handle the 
diff erent completion kinds that come from the Sema.
+    switch (R.Kind) {
+    case CodeCompletionResult::RK_Declaration: {
+      const NamedDecl *D = R.Declaration;
+      ToInsert = R.Declaration->getNameAsString();
+      // If we have a function decl that has no arguments we want to
+      // complete the empty parantheses for the user. If the function has
+      // arguments, we at least complete the opening bracket.
+      if (const FunctionDecl *F = dyn_cast<FunctionDecl>(D)) {
+        if (F->getNumParams() == 0)
+          ToInsert += "()";
+        else
+          ToInsert += "(";
+        raw_string_ostream OS(Description);
+        F->print(OS, m_desc_policy, false);
+        OS.flush();
+      } else if (const VarDecl *V = dyn_cast<VarDecl>(D)) {
+        Description = V->getType().getAsString(m_desc_policy);
+      } else if (const FieldDecl *F = dyn_cast<FieldDecl>(D)) {
+        Description = F->getType().getAsString(m_desc_policy);
+      } else if (const NamespaceDecl *N = dyn_cast<NamespaceDecl>(D)) {
+        // If we try to complete a namespace, then we can directly append
+        // the '::'.
+        if (!N->isAnonymousNamespace())
+          ToInsert += "::";
+      }
+      break;
+    }
+    case CodeCompletionResult::RK_Keyword:
+      ToInsert = R.Keyword;
+      break;
+    case CodeCompletionResult::RK_Macro:
+      ToInsert = R.Macro->getName().str();
+      break;
+    case CodeCompletionResult::RK_Pattern:
+      ToInsert = R.Pattern->getTypedText();
+      break;
+    }
+    // We also filter some internal lldb identifiers here. The user
+    // shouldn't see these.
+    if (llvm::StringRef(ToInsert).startswith("$__lldb_"))
+      return llvm::None;
+    if (ToInsert.empty())
+      return llvm::None;
+    // Merge the suggested Token into the existing command line to comply
+    // with the kind of result the lldb API expects.
+    std::string CompletionSuggestion =
+        mergeCompletion(m_expr, m_position, ToInsert);
+
+    CompletionResult::Completion completion(CompletionSuggestion, Description,
+                                            CompletionMode::Normal);
+    return {{completion, R.Priority}};
+  }
+
+public:
+  /// Adds the completions to the given CompletionRequest.
+  void GetCompletions(CompletionRequest &request) {
+    // Bring m_completions into a deterministic order and pass it on to the
+    // CompletionRequest.
+    llvm::sort(m_completions);
+
+    for (const CompletionWithPriority &C : m_completions)
+      request.AddCompletion(C.completion.GetCompletion(),
+                            C.completion.GetDescription(),
+                            C.completion.GetMode());
+  }
+
   /// \name Code-completion callbacks
   /// Process the finalized code-completion results.
   void ProcessCodeCompleteResults(Sema &SemaRef, CodeCompletionContext Context,
@@ -806,59 +903,11 @@ class CodeComplete : public CodeCompleteConsumer {
         continue;
 
       CodeCompletionResult &R = Results[I];
-      std::string ToInsert;
-      std::string Description;
-      // Handle the 
diff erent completion kinds that come from the Sema.
-      switch (R.Kind) {
-      case CodeCompletionResult::RK_Declaration: {
-        const NamedDecl *D = R.Declaration;
-        ToInsert = R.Declaration->getNameAsString();
-        // If we have a function decl that has no arguments we want to
-        // complete the empty parantheses for the user. If the function has
-        // arguments, we at least complete the opening bracket.
-        if (const FunctionDecl *F = dyn_cast<FunctionDecl>(D)) {
-          if (F->getNumParams() == 0)
-            ToInsert += "()";
-          else
-            ToInsert += "(";
-          raw_string_ostream OS(Description);
-          F->print(OS, m_desc_policy, false);
-          OS.flush();
-        } else if (const VarDecl *V = dyn_cast<VarDecl>(D)) {
-          Description = V->getType().getAsString(m_desc_policy);
-        } else if (const FieldDecl *F = dyn_cast<FieldDecl>(D)) {
-          Description = F->getType().getAsString(m_desc_policy);
-        } else if (const NamespaceDecl *N = dyn_cast<NamespaceDecl>(D)) {
-          // If we try to complete a namespace, then we can directly append
-          // the '::'.
-          if (!N->isAnonymousNamespace())
-            ToInsert += "::";
-        }
-        break;
-      }
-      case CodeCompletionResult::RK_Keyword:
-        ToInsert = R.Keyword;
-        break;
-      case CodeCompletionResult::RK_Macro:
-        ToInsert = R.Macro->getName().str();
-        break;
-      case CodeCompletionResult::RK_Pattern:
-        ToInsert = R.Pattern->getTypedText();
-        break;
-      }
-      // At this point all information is in the ToInsert string.
-
-      // We also filter some internal lldb identifiers here. The user
-      // shouldn't see these.
-      if (StringRef(ToInsert).startswith("$__lldb_"))
+      llvm::Optional<CompletionWithPriority> CompletionAndPriority =
+          getCompletionForResult(R);
+      if (!CompletionAndPriority)
         continue;
-      if (!ToInsert.empty()) {
-        // Merge the suggested Token into the existing command line to comply
-        // with the kind of result the lldb API expects.
-        std::string CompletionSuggestion =
-            mergeCompletion(m_expr, m_position, ToInsert);
-        m_request.AddCompletion(CompletionSuggestion, Description);
-      }
+      m_completions.push_back(*CompletionAndPriority);
     }
   }
 
@@ -895,12 +944,13 @@ bool ClangExpressionParser::Complete(CompletionRequest &request, unsigned line,
   // the LLVMUserExpression which exposes the right API. This should never fail
   // as we always have a ClangUserExpression whenever we call this.
   ClangUserExpression *llvm_expr = cast<ClangUserExpression>(&m_expr);
-  CodeComplete CC(request, m_compiler->getLangOpts(), llvm_expr->GetUserText(),
+  CodeComplete CC(m_compiler->getLangOpts(), llvm_expr->GetUserText(),
                   typed_pos);
   // We don't need a code generator for parsing.
   m_code_generator.reset();
   // Start parsing the expression with our custom code completion consumer.
   ParseInternal(mgr, &CC, line, pos);
+  CC.GetCompletions(request);
   return true;
 }
 

diff  --git a/lldb/test/API/commands/expression/completion/TestExprCompletion.py b/lldb/test/API/commands/expression/completion/TestExprCompletion.py
index 5266266b6ab2..9ff9052bb3fc 100644
--- a/lldb/test/API/commands/expression/completion/TestExprCompletion.py
+++ b/lldb/test/API/commands/expression/completion/TestExprCompletion.py
@@ -201,26 +201,26 @@ def test_expr_completion_with_descriptions(self):
                                           '// Break here', self.main_source_spec)
 
         self.check_completion_with_desc("expr ", [
-            # VarDecls have their type as description.
-            ["some_expr", "Expr &"],
             # builtin types have no description.
             ["int", ""],
-            ["float", ""]
-        ])
+            ["float", ""],
+            # VarDecls have their type as description.
+            ["some_expr", "Expr &"],
+        ], enforce_order = True)
         self.check_completion_with_desc("expr some_expr.", [
             # Functions have their signature as description.
-            ["some_expr.Self()", "Expr &Self()"],
+            ["some_expr.~Expr()", "inline ~Expr()"],
             ["some_expr.operator=(", "inline Expr &operator=(const Expr &)"],
-            ["some_expr.FooNumbersBar1()", "int FooNumbersBar1()"],
+            # FieldDecls have their type as description.
+            ["some_expr.MemberVariableBar", "int"],
             ["some_expr.StaticMemberMethodBar()", "static int StaticMemberMethodBar()"],
-            ["some_expr.FooWithArgsBar(", "int FooWithArgsBar(int)"],
+            ["some_expr.Self()", "Expr &Self()"],
             ["some_expr.FooNoArgsBar()", "int FooNoArgsBar()"],
+            ["some_expr.FooWithArgsBar(", "int FooWithArgsBar(int)"],
+            ["some_expr.FooNumbersBar1()", "int FooNumbersBar1()"],
             ["some_expr.FooUnderscoreBar_()", "int FooUnderscoreBar_()"],
             ["some_expr.FooWithMultipleArgsBar(", "int FooWithMultipleArgsBar(int, int)"],
-            ["some_expr.~Expr()", "inline ~Expr()"],
-            # FieldDecls have their type as description.
-            ["some_expr.MemberVariableBar", "int"],
-        ])
+        ], enforce_order = True)
 
     def assume_no_completions(self, str_input, cursor_pos = None):
         interp = self.dbg.GetCommandInterpreter()

diff  --git a/lldb/tools/lldb-vscode/lldb-vscode.cpp b/lldb/tools/lldb-vscode/lldb-vscode.cpp
index 764eded8ce8d..f1620d945fbc 100644
--- a/lldb/tools/lldb-vscode/lldb-vscode.cpp
+++ b/lldb/tools/lldb-vscode/lldb-vscode.cpp
@@ -967,7 +967,7 @@ void request_completions(const llvm::json::Object &request) {
     text.c_str(),
     actual_column,
     0, -1, matches, descriptions);
-  size_t count = std::min((uint32_t)50, matches.GetSize());
+  size_t count = std::min((uint32_t)100, matches.GetSize());
   targets.reserve(count);
   for (size_t i = 0; i < count; i++) {
     std::string match = matches.GetStringAtIndex(i);