[cfe-dev] Extend Stmt with proper end location?

Thu Mar 26 10:45:48 PDT 2020

Hi,

by now I’ve extended LexerUtils to handle exactly what you are mentioning.

See ‘getUnifiedEndLoc’ in https://reviews.llvm.org/D75813#change-Bc7Hs8CXH7Z9 (and feel free to review ;-) )

findSemiAfterLocation is indeed very similar, but not quite the same.

Maybe those two approaches could be merged.

Is anyone willing to touch findSemiAfterLocation, given that its last change was ~6 years ago?

@off:

As I was following up on that proof of concept and making every place handle the new members, it did somewhat explode in my face. co_return didn’t make life easier.

I gave up on improving the proof of concept, also because the general feedback here seems to be clear: no change to Stmt.

Alex

From: Whisperity <whisperity at gmail.com>
Sent: Thursday, 26 March 2020 10:17
To: alex at lanin.de
Cc: Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] Extend Stmt with proper end location?

While it most likely won't solve all cases, there's Lexer::getLocForEndOfToken. It can be used to create a range that grabs the semicolon at the end, assuming there is one. Being context-sensitive about when we try to grab it is still a requirement.

I do vaguely remember seeing it used in Tidy in a few places.

But other tools seem to have an issue with this too. Looking at the docs for getLocForEndOfToken, they show that arcmt has a function named findSemiAfterLocation.

off {

  I'm in awe that extending Stmt classes worked. I tried something similar recently and it just exploded in my face.

}

On Fri, 13 Mar 2020, 10:08 via cfe-dev, <cfe-dev at lists.llvm.org> wrote:

Hello,

Currently, Stmt::getEndLoc() returns slightly different results depending on the Stmt type.
Specifically, DoStmt (do-while), GotoStmt, ContinueStmt, BreakStmt, ReturnStmt, AsmStmt and SEHLeaveStmt do not track the location of the mandatory semicolon at the end.
(Expr is out of scope of this mail thread)

This is not really a high priority problem, but it makes some replacements in clang-tidy unnecessarily difficult.
Currently, within checkers, one has to differentiate by statement type and then lex past the statement's end, skipping comments, until a tok::semicolon is found.
For statements with children, such as an IfStmt without braces, this of course has to be based on the last child Stmt.
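
As a rough illustration of the "skip comments until the semicolon" step, here is a minimal self-contained sketch. It works on a plain source string rather than on Clang's token stream; the name findSemiAfter is hypothetical and not a Clang API, and it ignores macros, line continuations and preprocessor directives. A real checker would go through the Lexer instead.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, NOT part of Clang: starting at offset `pos`, skip
// whitespace and comments. If the first remaining character is ';', return
// its offset; otherwise (other token, unterminated comment, end of input)
// return std::nullopt.
std::optional<std::size_t> findSemiAfter(std::string_view src,
                                         std::size_t pos) {
  while (pos < src.size()) {
    const char c = src[pos];
    if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
      ++pos;  // plain whitespace
    } else if (src.substr(pos, 2) == "//") {
      // Line comment: continue after the end of the line.
      const std::size_t eol = src.find('\n', pos);
      if (eol == std::string_view::npos) return std::nullopt;
      pos = eol + 1;
    } else if (src.substr(pos, 2) == "/*") {
      // Block comment: continue after the closing "*/".
      const std::size_t close = src.find("*/", pos + 2);
      if (close == std::string_view::npos) return std::nullopt;
      pos = close + 2;
    } else {
      // First real character decides: a semicolon or something else.
      if (c == ';') return pos;
      return std::nullopt;
    }
  }
  return std::nullopt;
}
```

The caller still has to decide, per statement type, which offset to start from; that context-sensitive part is exactly what this sketch does not solve.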

However, I feel this is an ugly workaround, and Stmt::getEndLoc() should just return the proper end location for all statements, including all mandatory tokens.
To accomplish this, the aforementioned statements require a new SourceLocation member.
My assumption is that this has little impact on memory & cache-locality, since those are not really high-occurrence statements - but I'm no expert.
Proof of concept is available here: https://reviews.llvm.org/D76108 (it has many many flaws, don't take it as ready for any kind of review).

Does it make sense to continue that way?

Regards,
Alexander Lanin

_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
