[PATCH] First revision of the dynamic ASTMatcher library

Thu May 2 10:54:33 PDT 2013

  I think this lexer could get some refactoring.  First, introduce numeric token codes for all tokens.  Replace ad-hoc logic in getNextToken with a switch().  It matters not because of performance, but because of consistency with other lexers/parsers we have.

  As an example of a design to look at, you could look at AST/CommentLexer (but you don't need the complexity 'LexerState', though).

  VariantValue looks unsafe to me.  It could have more assertions for type checks (also, why not use PointerUnion inside?).  Using 'std::string*' is a bit leak-prone.

================
Comment at: include/clang/ASTMatchers/Dynamic/Diagnostics.h:91
@@ +90,3 @@
+  };
+  const llvm::ArrayRef<ErrorFrame> frames() const { return Frames; }
+
----------------
const llvm::ArrayRef -> llvm::ArrayRef, ArrayRef is intrinsically const.

================
Comment at: include/clang/ASTMatchers/Dynamic/Parser.h:72
@@ +71,3 @@
+    processMatcherToken(const TokenInfo &MatcherInfo,
+                        llvm::ArrayRef<ParserValue> Args,
+                        Diagnostics *Error) = 0;
----------------
If you #include "clang/Basic/LLVM.h", you can drop 'llvm::' qualifications from StringRef and ArrayRef.

================
Comment at: include/clang/ASTMatchers/Dynamic/Parser.h:42-43
@@ +41,4 @@
+
+/// \brief Matcher expression parser.
+class Parser {
+public:
----------------
This class just holds a few static member functions.  Is the class really needed here?

================
Comment at: lib/ASTMatchers/Dynamic/Parser.cpp:36
@@ +35,3 @@
+  ///
+  /// /return The TokenInfo for the next token. It will continually return the
+  ///   empty token when the end of the input has been reached.
----------------
\return

================
Comment at: lib/ASTMatchers/Dynamic/Parser.cpp:57
@@ +56,3 @@
+      size_t TokenLength = 0;
+      while (TokenLength < Code.size() && !isspace(Code[TokenLength]) &&
+             !isSingleCharToken(Code[TokenLength]))
----------------
Please use functions from clang/Basic/CharInfo.h instead of ctype.

================
Comment at: lib/ASTMatchers/Dynamic/Parser.cpp:45
@@ +44,3 @@
+    if (Code.empty()) {
+      // Nothing to do.
+    } else if (isSingleCharToken(Code[0])) {
----------------
I don't think this is the best design (to produce an unlimited number of empty tokens at the end); all other lexers in LLVM and Clang usually produce an EOF token.

================
Comment at: lib/ASTMatchers/Dynamic/Parser.cpp:98
@@ +97,3 @@
+  void consumeWhitespace() {
+    while (!Code.empty() && isspace(Code[0])) {
+      if (Code[0] == '\n') {
----------------
Use clang/Basic/CharInfo.h

================
Comment at: lib/ASTMatchers/Dynamic/Parser.cpp:145
@@ +144,3 @@
+  const TokenInfo &OpenToken = Tokenizer->getNextToken();
+  if (OpenToken.Token != "(") {
+    Error->pushErrorFrame(OpenToken, Error->ET_PARSER_NO_OPEN_PAREN);
----------------
String comparisons to compare tokens is not a good idea.

================
Comment at: lib/ASTMatchers/Dynamic/VariantValue.cpp:60-61
@@ +59,4 @@
+void VariantValue::setImpl(const std::string &NewValue) {
+  Type = VT_String;
+  Value.String = new std::string(NewValue);
+}
----------------
What if someone calls setImpl() repeatedly?  Do we just leak old values?

http://llvm-reviews.chandlerc.com/D714

BRANCH
  svn

ARCANIST PROJECT
  clang