[cfe-dev] Extend Stmt with proper end location?

Reid Kleckner via cfe-dev cfe-dev at lists.llvm.org
Fri Mar 27 15:35:35 PDT 2020


On Wed, Mar 25, 2020 at 5:08 PM Sam McCall via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On Tue, Mar 17, 2020 at 6:11 AM John McCall via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Furthermore, I think expressions are important to consider, because
>> the practical limitations on finding the semicolon after an expression
>> are exactly the same as finding it after break.
>>
> To spell this out a little more: formally in `foo();` there's an
> expression-statement which consists of the call expression and the
> semicolon. But clang just uses the CallExpr node to represent both, and
> CallExpr obviously(?) shouldn't include the semicolon in its source range.
>

Disclaimer: This is idle speculation, and I don't have any experience
writing refactoring tools.

I wonder if we could store the delimiter locations in a memory-efficient
way by adding a second TrailingObjects array to CompoundStmt. The overhead
would be 4 bytes for every semicolon in the TU, which is potentially a lot
when considering unused inline functions, but is not much when compared to
the overhead of the Stmt node itself and the 8-byte Stmt* pointer already
held in CompoundStmt.
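
A minimal sketch of that layout, as a standalone toy rather than clang code.
The names `ToyStmt`, `ToyLoc`, and `ToyCompoundStmt` are hypothetical, and the
sketch hand-rolls the trailing storage instead of using LLVM's helper. It
assumes a 4-byte location and, for simplicity, one semicolon slot per child
statement; a real design would probably need a sentinel for children that do
not end in a semicolon.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Toy stand-ins, NOT clang's real classes.
struct ToyStmt { int Kind; };
using ToyLoc = std::uint32_t; // assumed 4-byte encoded source location

// One allocation: header, then N child pointers, then N semicolon locations.
// Pointers come first so the 8-byte array stays aligned; the 4-byte array
// that follows needs no padding.
class alignas(void *) ToyCompoundStmt {
  unsigned NumStmts;
  explicit ToyCompoundStmt(unsigned N) : NumStmts(N) {}

  ToyStmt **stmts() { return reinterpret_cast<ToyStmt **>(this + 1); }
  ToyLoc *semis() { return reinterpret_cast<ToyLoc *>(stmts() + NumStmts); }

public:
  // Bytes needed for a compound statement with N children.
  static std::size_t allocSize(unsigned N) {
    return sizeof(ToyCompoundStmt) + N * (sizeof(ToyStmt *) + sizeof(ToyLoc));
  }

  static ToyCompoundStmt *create(ToyStmt *const *Body, const ToyLoc *Semis,
                                 unsigned N) {
    void *Mem = std::malloc(allocSize(N));
    if (!Mem)
      return nullptr;
    auto *CS = new (Mem) ToyCompoundStmt(N);
    std::copy_n(Body, N, CS->stmts());
    std::copy_n(Semis, N, CS->semis());
    return CS;
  }
  static void destroy(ToyCompoundStmt *CS) { std::free(CS); }

  unsigned size() const { return NumStmts; }
  ToyStmt *stmt(unsigned I) { return stmts()[I]; }
  ToyLoc semiLoc(unsigned I) { return semis()[I]; }
};
```

In this toy, each extra child costs `sizeof(ToyStmt *) + sizeof(ToyLoc)` bytes,
which is the 8 + 4 split described above on a typical 64-bit target.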

There is of course the case of semicolon locations for non-compound
statements (`if (cond) foo();`), but one could invent a new AST node for
that case without wasting much memory.
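
One way such a node could look, again as a standalone toy with hypothetical
names (`Node`, `SemiTerminated`, `endIncludingSemi`); nothing here is an
existing clang class. The wrapper is only allocated where a statement with a
trailing semicolon appears outside a compound statement, so the common case
pays nothing.

```cpp
#include <cassert>
#include <cstdint>

using Loc = std::uint32_t; // assumed 4-byte source offset

// Toy AST node; Begin/End cover the node WITHOUT any trailing ';'.
struct Node {
  enum Kind { Call, If, Terminated } K;
  Loc Begin, End;
};

// Hypothetical wrapper: a sub-statement plus the location of its ';'.
struct SemiTerminated : Node {
  Node *Sub;
  Loc SemiLoc;
  SemiTerminated(Node *S, Loc Semi)
      : Node{Terminated, S->Begin, Semi}, Sub(S), SemiLoc(Semi) {}
};

// End of the statement including its semicolon, when one was recorded.
inline Loc endIncludingSemi(const Node *N) {
  if (N->K == Node::Terminated)
    return static_cast<const SemiTerminated *>(N)->SemiLoc;
  return N->End;
}
```

For `if (cond) foo();` the call spans offsets 10 through 14 and the semicolon
sits at offset 15, so wrapping the call in `SemiTerminated` extends the end
from 14 to 15 while the call node itself keeps its own range.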

It sounds like there are a lot of clang tools that use the lexer to search
for semicolons. Would it help to store them this way instead?
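
For comparison, the lexer-based search those tools do amounts to roughly the
following. This is a simplified standalone sketch, not clang's Lexer: the
function name `findSemiAfter` is hypothetical, and it works on a plain string
and ignores macros, preprocessor directives, and line splices, which real
tools have to deal with.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Starting just past the end of a statement's source range, skip whitespace
// and comments and return the offset of ';' if it is the next token.
inline std::optional<std::size_t> findSemiAfter(std::string_view Src,
                                                std::size_t Pos) {
  while (Pos < Src.size()) {
    unsigned char C = static_cast<unsigned char>(Src[Pos]);
    if (std::isspace(C)) {
      ++Pos;
      continue;
    }
    if (Src.substr(Pos, 2) == "//") { // line comment: skip to end of line
      Pos = Src.find('\n', Pos);
      if (Pos == std::string_view::npos)
        return std::nullopt;
      continue;
    }
    if (Src.substr(Pos, 2) == "/*") { // block comment: skip past "*/"
      std::size_t Close = Src.find("*/", Pos + 2);
      if (Close == std::string_view::npos)
        return std::nullopt;
      Pos = Close + 2;
      continue;
    }
    if (C == ';')
      return Pos;
    return std::nullopt; // some other token came first
  }
  return std::nullopt;
}
```

With stored locations, a tool would read the offset directly instead of
re-scanning the buffer like this for every statement it touches.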