[Lldb-commits] [PATCH] D54454: Be more permissive in what we consider a variable name.

Zachary Turner via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Mon Nov 12 15:40:26 PST 2018


zturner created this revision.
zturner added reviewers: clayborg, jingham, labath.
Herald added subscribers: JDevlieghere, aprantl.

When we evaluate a variable name as part of an expression, we run a regex against it to break it apart.  The intent seems to be to extract the main variable name (i.e. the part of the user input which is both necessary and sufficient to find the record in debug info) and leave the rest of the expression alone (for example, the variable could be an instance of a class, and you have written `variable.member`).

But I believe the current regex is too restrictive.  For example, it disallows variable templates: if the user writes `variable<int>`, we strip off the `<int>`, even though it is absolutely necessary to find the proper record in the debug info.  It also doesn't allow things like ` 'anonymous namespace'::variable`, which is a valid name under the Microsoft ABI.  Nor does it permit spaces, so we couldn't handle something like `foo<long double>` (even after fixing the template issue).

Rather than try to accurately construct a regex for the set of all possible things that constitute a variable name, it seems easier to construct a regex to match all the things that **do not** constitute a variable name.  Specifically, that means an occurrence of the `.` operator or `->` operator, since that is what ultimately marks the beginning of a sub-expression.

So this changes the regex accordingly.
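
To make the behavioral difference concrete, here is a minimal C++ sketch that runs the old and new patterns over a few inputs.  It uses `std::regex` as a stand-in for LLDB's `RegularExpression` (an assumption; the two engines may differ in details), and `SplitVariableExpr`, `kOldPattern` and `kNewPattern` are hypothetical names for illustration, not LLDB identifiers.  One thing the sketch makes visible is that the character class `[^.-]` excludes every `-`, not only the `->` operator, so a name such as `foo<-1>` would be cut short.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Patterns copied from the diff; the constant names are made up for this sketch.
static const char *kOldPattern = "^([A-Za-z_:][A-Za-z_0-9:]*)(.*)";
static const char *kNewPattern = "^([^.-]*)(.*)";

// Hypothetical helper (not part of LLDB): split an expression into
// {variable name, remainder} using a pattern with two capture groups.
// std::regex stands in for LLDB's RegularExpression here.
std::pair<std::string, std::string>
SplitVariableExpr(const std::string &pattern, const std::string &expr) {
  const std::regex re(pattern);
  std::smatch match;
  if (!std::regex_match(expr, match, re))
    return {std::string(), expr}; // No match: treat everything as remainder.
  return {match[1].str(), match[2].str()};
}
```

With these definitions, `SplitVariableExpr(kNewPattern, "variable<int>.member")` yields `variable<int>` and `.member`, whereas the old pattern stops at the `<`.  On `foo<-1>.member`, the new pattern stops at the minus sign and yields `foo<` and `-1>.member`.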


https://reviews.llvm.org/D54454

Files:
  lldb/source/Symbol/Variable.cpp


Index: lldb/source/Symbol/Variable.cpp
===================================================================
--- lldb/source/Symbol/Variable.cpp
+++ lldb/source/Symbol/Variable.cpp
@@ -383,8 +383,12 @@
   } break;
 
   default: {
-    static RegularExpression g_regex(
-        llvm::StringRef("^([A-Za-z_:][A-Za-z_0-9:]*)(.*)"));
+    // A variable name can be something like foo, foo::bar, foo<int>::bar,
+    // ::foo, foo<long double>::bar, and more.  Rather than trying to construct
+    // a perfect regex, which is almost certainly going to lead to some edge
+    // cases that we don't handle, let's just take everything until the first
+    // . operator or -> operator.
+    static RegularExpression g_regex("^([^.-]*)(.*)");
     RegularExpression::Match regex_match(1);
     std::string variable_name;
     variable_list.Clear();


