[Lldb-commits] [lldb] [LLDB] Add Lexer (with tests) for DIL (Data Inspection Language). (PR #123521)
Pavel Labath via lldb-commits
lldb-commits at lists.llvm.org
Mon Jan 27 04:37:28 PST 2025
================
@@ -0,0 +1,158 @@
+//===-- DILLexer.h ----------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_VALUEOBJECT_DILLEXER_H_
+#define LLDB_VALUEOBJECT_DILLEXER_H_
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Error.h"
+#include <cstdint>
+#include <limits.h>
+#include <memory>
+#include <string>
+#include <vector>
+
+namespace lldb_private {
+
+namespace dil {
+
+/// Class defining the tokens generated by the DIL lexer and used by the
+/// DIL parser.
+class Token {
+public:
+  enum Kind {
+    coloncolon,
+    eof,
+    identifier,
+    invalid,
+    kw_namespace,
+    l_paren,
+    none,
+    r_paren,
+    unknown,
+  };
+
+  Token(Kind kind, std::string spelling, uint32_t start)
+      : m_kind(kind), m_spelling(spelling), m_start_pos(start) {}
+
+  Token() : m_kind(Kind::none), m_spelling(""), m_start_pos(0) {}
+
+  void SetKind(Kind kind) { m_kind = kind; }
+
+  Kind GetKind() const { return m_kind; }
+
+  std::string GetSpelling() const { return m_spelling; }
+
+  uint32_t GetLength() const { return m_spelling.size(); }
+
+  bool Is(Kind kind) const { return m_kind == kind; }
+
+  bool IsNot(Kind kind) const { return m_kind != kind; }
+
+  bool IsOneOf(Kind kind1, Kind kind2) const { return Is(kind1) || Is(kind2); }
+
+  template <typename... Ts> bool IsOneOf(Kind kind, Ts... Ks) const {
+    return Is(kind) || IsOneOf(Ks...);
+  }
+
+  uint32_t GetLocation() const { return m_start_pos; }
+
+  static llvm::StringRef GetTokenName(Kind kind);
+
+private:
+  Kind m_kind;
+  std::string m_spelling;
+  uint32_t m_start_pos; // within entire expression string
+};
+
+/// Class for doing the simple lexing required by DIL.
+class DILLexer {
+public:
+  DILLexer(llvm::StringRef dil_expr) : m_expr(dil_expr) {
+    m_cur_pos = m_expr.begin();
+    // Use UINT_MAX to indicate invalid/uninitialized value.
+    m_tokens_idx = UINT_MAX;
+    m_invalid_token = Token(Token::invalid, "", 0);
+  }
+
+  llvm::Expected<bool> LexAll();
+
+  /// Return the lexed token N+1 positions ahead of the 'current' token
----------------
labath wrote:
I feel like there are too many ways to navigate the token stream here. You can call GetCurrentToken+IncrementTokenIdx, or GetNextToken (which I guess increments the index automatically), or LookAhead+AcceptLookAhead.
I think it would be better to start with something simple (we can add more or revamp the existing API if it turns out to be clunky). What would you say to something like:
```
const Token &LookAhead(uint32_t N /* add `=1` if you want*/);
const Token &GetCurrentToken() { return LookAhead(0); } // just a fancy name for a look ahead of zero
void Advance(uint32_t N = 1); // advance the token stream
```
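To make the suggestion concrete, here is a minimal, self-contained sketch of how those three functions could fit together. It is an illustration only, not the code in the PR: `TokenStream` and the cut-down `Token` struct are hypothetical stand-ins, and it assumes the tokens have already been lexed into a vector that ends with an `eof` token, so that looking or advancing past the end simply yields `eof`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Cut-down stand-in for the Token class in the patch (hypothetical).
struct Token {
  enum Kind { coloncolon, eof, identifier, l_paren, r_paren };
  Kind kind;
  std::string spelling;
};

// Hypothetical holder for an already-lexed token vector. In the patch this
// state would presumably live in DILLexer itself.
class TokenStream {
public:
  // Assumption: `tokens` is non-empty and its last element is an eof token.
  explicit TokenStream(std::vector<Token> tokens)
      : m_tokens(std::move(tokens)) {}

  // Return the token N positions ahead of the current one. Requests past
  // the end are clamped to the trailing eof token.
  const Token &LookAhead(uint32_t N) const {
    std::size_t idx = std::min<std::size_t>(m_idx + N, m_tokens.size() - 1);
    return m_tokens[idx];
  }

  // Just a named shorthand for a look-ahead of zero.
  const Token &GetCurrentToken() const { return LookAhead(0); }

  // Move the current position forward by N tokens, stopping at eof.
  void Advance(uint32_t N = 1) {
    m_idx = std::min<std::size_t>(m_idx + N, m_tokens.size() - 1);
  }

private:
  std::vector<Token> m_tokens;
  std::size_t m_idx = 0;
};
```

With this shape, a parser peeks with `LookAhead(N)` and commits with `Advance(N)`, so there is only one way to read a token and one way to move the cursor. Clamping to `eof` is a design choice of this sketch; returning an invalid token or asserting would work equally well.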
https://github.com/llvm/llvm-project/pull/123521