[Lldb-commits] [PATCH] D31451: New C++ function name parsing logic

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Thu Mar 30 09:38:37 PDT 2017


labath added a comment.

I cant help but feel that this could have been done in a simpler way, but then again, some of the cases you have dug up are quite tricky. 
I think we should do some performance measurements to see whether this needs more optimising or it's fine as is.

I propose the following benchmark:
bin/lldb bin/clang

- make sure clang is statically linked

breakpoint set --name non_exisiting_function

Clang needs to be statically linked so we can access all its symbols without running​ it (which would skew the benchmark) -- this is the reason we cannot use lldb itself as most of its symbols are in liblldb.

Setting the breakpoint on a nonexistent function avoids us timing the breakpoint setting machinery, while still getting every symbol in the executable parsed.

If the performance is ok i am quite happy with this, apart from some stylistic nits.



================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp:94
   if (!m_parsed && m_full) {
-    //        ConstString mangled;
-    //        m_full.GetMangledCounterpart(mangled);
-    //        printf ("\n   parsing = '%s'\n", m_full.GetCString());
-    //        if (mangled)
-    //            printf ("   mangled = '%s'\n", mangled.GetCString());
-    m_parse_error = false;
-    m_parsed = true;
-    llvm::StringRef full(m_full.GetCString());
-
-    size_t arg_start, arg_end;
-    llvm::StringRef parens("()", 2);
-    if (ReverseFindMatchingChars(full, parens, arg_start, arg_end)) {
-      m_arguments = full.substr(arg_start, arg_end - arg_start + 1);
-      if (arg_end + 1 < full.size())
-        m_qualifiers = full.substr(arg_end + 1);
-      if (arg_start > 0) {
-        size_t basename_end = arg_start;
-        size_t context_start = 0;
-        size_t context_end = llvm::StringRef::npos;
-        if (basename_end > 0 && full[basename_end - 1] == '>') {
-          // TODO: handle template junk...
-          // Templated function
-          size_t template_start, template_end;
-          llvm::StringRef lt_gt("<>", 2);
-          if (ReverseFindMatchingChars(full, lt_gt, template_start,
-                                       template_end, basename_end)) {
-            // Check for templated functions that include return type like:
-            // 'void foo<Int>()'
-            context_start = full.rfind(' ', template_start);
-            if (context_start == llvm::StringRef::npos)
-              context_start = 0;
-            else
-              ++context_start;
-
-            context_end = full.rfind(':', template_start);
-            if (context_end == llvm::StringRef::npos ||
-                context_end < context_start)
-              context_end = context_start;
-          } else {
-            context_end = full.rfind(':', basename_end);
-          }
-        } else if (context_end == llvm::StringRef::npos) {
-          context_end = full.rfind(':', basename_end);
-        }
-
-        if (context_end == llvm::StringRef::npos)
-          m_basename = full.substr(0, basename_end);
-        else {
-          if (context_start < context_end)
-            m_context =
-                full.substr(context_start, context_end - 1 - context_start);
-          const size_t basename_begin = context_end + 1;
-          m_basename =
-              full.substr(basename_begin, basename_end - basename_begin);
-        }
-        m_type = eTypeUnknownMethod;
-      } else {
-        m_parse_error = true;
-        return;
-      }
-
-      if (!IsValidBasename(m_basename)) {
-        // The C++ basename doesn't match our regular expressions so this can't
-        // be a valid C++ method, clear everything out and indicate an error
-        m_context = llvm::StringRef();
-        m_basename = llvm::StringRef();
-        m_arguments = llvm::StringRef();
-        m_qualifiers = llvm::StringRef();
-        m_parse_error = true;
-      }
+    CPlusPlusNameParser parser(m_full.GetStringRef());
+    auto function = parser.ParseAsFunctionDefinition();
----------------
How about the following api:
```
if (auto function​ = CPlusPlusNameParser::ParseAsFunctionDefinition(m_full.GetStringRef())) {
  ...
```


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp:160
+  auto full_name = parser.ParseAsFullName();
+  if (full_name.hasValue()) {
+    identifier = full_name.getValue().m_basename;
----------------
Same here


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp:62
+bool CPlusPlusNameParser::HasMoreTokens() {
+  return m_next_token_index < static_cast<int>(m_tokens.size());
+}
----------------
Wouldn't it be better to change the type of m_next_token_index to size_t?


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp:188
+  // as arithmetic or shift operators not as template brackets.
+  // Examples: std::enable_if<(10u)<(64), bool>
+  //           f<A<operator<(X,Y)::Subclass>>
----------------
Is this really the case for the types we are interested in? I would have hoped that this would get simplified to `std::enable_if<true, bool> before it reaches us?


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp:572
+  clang::SourceLocation end_location = last_token.getLocation();
+  const char *begin_ptr = m_text.data() + begin_location.getRawEncoding();
+  const char *end_ptr =
----------------
Could this be written as: 
`m_text.take_front(end_pos).drop_front(start_pos)`?


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.h:33
+
+  struct ParsedName {
+    llvm::StringRef m_basename;
----------------
I think we dont put m_ for fields of dumb structs that are meant to be accessed directly.


================
Comment at: source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.h:94
+  // when destructed unless it's manually removed with Remove().
+  class Bookmark {
+  public:
----------------
Please make the type move-only. Otherwise you will have a fun time debugging accidental copies. (You already have one, although it is benign now)


================
Comment at: unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp:106
+    EXPECT_EQ(test.qualifiers, method.GetQualifiers())
+        << method.GetQualifiers();
     EXPECT_EQ(test.scope_qualified_name, method.GetScopeQualifiedName());
----------------
Would defining operator<< for std::ostream and StringRef ( local tomthis test) enable you to get rid of these?
I've ran into this before but was never annoyed enough to actually do it.. :/


https://reviews.llvm.org/D31451





More information about the lldb-commits mailing list