[PATCH] D59887: [Syntax] Introduce TokenBuffer, start clangToolingSyntax library
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue May 7 07:45:39 PDT 2019
sammccall added inline comments.
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:216
+ llvm::dbgs()
+ << "$[collect-tokens]: "
+ << syntax::Token(T).dumpForTests(SourceMgr) << "\n"
----------------
what's "$[collect-tokens]"?
Maybe just "Token: "?
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:224
+
+TokenBuffer TokenCollector::consume() && {
+ TokenBuffer B(SourceMgr);
----------------
this function is now >100 lines long. Can we split it up?
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:225
+TokenBuffer TokenCollector::consume() && {
+ TokenBuffer B(SourceMgr);
+ for (unsigned I = 0; I < Expanded.size(); ++I) {
----------------
this could be a member, which would help splitting up
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:226
+ TokenBuffer B(SourceMgr);
+ for (unsigned I = 0; I < Expanded.size(); ++I) {
+ auto FID =
----------------
Is there a reason we do this at the end instead of as tokens are collected?
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:243
+ llvm::DenseMap<FileID, unsigned> NextSpelled;
+ auto ConsumeSpelledUntil = [&](TokenBuffer::MarkedFile &File,
+ SourceLocation L) {
----------------
consumespelleduntil and fillgapuntil could be methods, I think
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:277
+ for (unsigned I = 0; I < Expanded.size() - 1; ++I) {
+ auto L = Expanded[I].location();
+ if (L.isFileID()) {
----------------
the body here could be a method too, I think
i.e. for each (expanded token), process it?
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:322
+
+ // Some files might have unaccounted spelled tokens at the end.
+ for (auto &F : B.Files) {
----------------
and similarly this section
================
Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:130
+ OS << llvm::formatv(
+ " ['{0}'_{1}, '{2}'_{3}) => ['{4}'_{5}, '{6}'_{7})\n",
+ PrintToken(File.SpelledTokens[M.BeginSpelled]), M.BeginSpelled,
----------------
sammccall wrote:
> As predicted :-) I think these `_<index>` suffixes are a maintenance hazard.
>
> In practice, the assertions are likely to add them by copy-pasting the test output.
>
> They guard against a class of error that doesn't seem very likely, and in practice they don't even really prevent it (because of the copy/paste issue).
>
> I'd suggest just dropping them, and living with the test assertions being slightly ambiguous.
>
> Failing that, some slightly trickier formatting could give more context:
>
> `A B C D E F` --> `A B ... E F`
>
> With special cases for smaller numbers of tokens. I don't like the irregularity of that approach though.
(this one is still open)
================
Comment at: clang/unittests/Tooling/Syntax/TokensTest.cpp:622
+
+// FIXME: add tests for mapping spelled tokens into offsets.
+
----------------
sammccall wrote:
> please fix :-)
(still missing?)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D59887/new/
https://reviews.llvm.org/D59887
More information about the cfe-commits
mailing list