[PATCH v3 2/9] Add the initial TypoExpr AST node for delayed typo correction.

Kaelyn Takata rikka at google.com
Fri Jul 25 15:21:28 PDT 2014


On Fri, Jul 25, 2014 at 1:57 PM, Richard Smith <richard at metafoo.co.uk>
wrote:

> On Mon, Jul 14, 2014 at 4:55 PM, Kaelyn Takata <rikka at google.com> wrote:
>
>> ---
>>  include/clang/AST/DataRecursiveASTVisitor.h |  1 +
>>  include/clang/AST/Expr.h                    | 20 ++++++++++++++++++++
>>  include/clang/AST/RecursiveASTVisitor.h     |  1 +
>>  include/clang/Basic/StmtNodes.td            |  1 +
>>  include/clang/Sema/SemaInternal.h           |  3 +++
>>  include/clang/Sema/TypoCorrection.h         |  5 +++++
>>  lib/AST/Expr.cpp                            |  1 +
>>  lib/AST/ExprClassification.cpp              |  3 ++-
>>  lib/AST/ExprConstant.cpp                    |  1 +
>>  lib/AST/ItaniumMangle.cpp                   |  1 +
>>  lib/AST/StmtPrinter.cpp                     |  5 +++++
>>  lib/AST/StmtProfile.cpp                     |  4 ++++
>>  lib/Sema/SemaExceptionSpec.cpp              |  1 +
>>  lib/Sema/SemaLookup.cpp                     | 16 ++++++++++++++++
>>  lib/Sema/TreeTransform.h                    |  6 ++++++
>>  lib/Serialization/ASTReaderStmt.cpp         |  4 ++++
>>  lib/Serialization/ASTWriterStmt.cpp         |  6 ++++++
>>  lib/StaticAnalyzer/Core/ExprEngine.cpp      |  1 +
>>  tools/libclang/CXCursor.cpp                 |  1 +
>>  19 files changed, 80 insertions(+), 1 deletion(-)
>
>
> The AST library is supposed to be self-contained; it's a layering
> violation to put TypoExpr member definitions into Sema. Conversely, it'd be
> a layering violation to put them into AST, since they currently delete Sema
> types. To avoid this, I suggest:
>

I'd have actually preferred to keep TypoExpr out of the AST library
altogether since it is a synthetic node generated by Sema for use by Sema
and really shouldn't ever be seen/touched outside of Sema, as it is simply
a placeholder for Sema's typo correction. But of course defining AST nodes
that don't live in the AST library to a certain degree isn't possible... ah
well. I knew I'd have to deal with the layering issue at some point

* The TypoExpr class should instead store a single opaque 'void*' pointer,
> and make no attempt to deallocate or otherwise manage it.
> * Sema owns the pointee and is responsible for deleting it (you can delete
> it and zero out the pointer on the TypoExpr at the point when you handle
> the TypoExpr)
>
> Alternatively, since you need Sema to track the live TypoExprs anyway, you
> could have Sema maintain a map from TypoExpr* to whatever data you want to
> store for the TypoExprs, and have the AST node be essentially stateless.
>

I think I'll go with this approach since a TypoExpr node is just a
placeholder for internal use by Sema. For the mapping, do you think it
would be better to move the recovery callback and diagnostic generator into
the TypoCorrectionConsumer so that the map is simply from a TypoExpr to a
TypoCorrectionConsumer, or to place the three unique_ptrs in a tuple that
would go into the map, or to wrap the three unique_ptrs in a struct that is
stored in the map?

Incidentally, I wonder whether it'd be useful for TypoExpr to store the
> typoed identifier, maybe as an IdentifierInfo*? (This'd allow you to
> produce something more useful from the statement printer and ast dumper,
> and we really don't care much how big a TypoExpr is, since we'll --
> presumably -- never have many of them.) This might also be useful for
> clients that want to turn off typo correction but keep the TypoExprs in the
> AST.
>

As the typo correction is currently structured, a TypoExpr would never be
generated when typo correction is turned off since the TypoExpr is only for
Sema to know when and where to go back to perform typo correction when it
wasn't immediately performed. I'm also not sure how useful it would be to
have meaningful output from the statement printer and AST dumper since the
TypoExpr shouldn't be kept around that long (it should be gone by the time
the full statement has finished being parsed and analysed--I cannot think
of any cases where that wouldn't be true that wouldn't also be a bug; after
all, until the TypoExpr is handled Sema won't know if the error it
represents is recoverable or what the recovered AST should be). If/once
there's a good use case for having the identifier-to-be-corrected stored in
the TypoExpr, it'll be simple enough to add since it is available when the
TypoExpr is created and is also already being stored in the
TypoCorrectionConsumer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140725/6c632db3/attachment.html>


More information about the cfe-commits mailing list