[cfe-dev] Stmt.getLocEnd()

Robert Ankeney rrankene at gmail.com
Wed Feb 6 18:35:37 PST 2013


That's interesting, as I'm doing code instrumentation and was doing
something similar.  But there are exceptions, such as:

for (...)
  for (...)
  { ... }

Here the body of the outer for is not a compound statement, yet its
getLocEnd() still points to just before the ending brace.  The same happens
when an if, while, switch or similar statement is the immediate body of the
for.  What I ended up doing (not pretty) is reading the character at the
getLocEnd() position and checking whether it is a '}'.  If so, I use
getLocWithOffset(1) to point past it; otherwise I use
findLocationAfterToken().  You can look at the character data with a call to
SourceManager's getCharacterData().
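
As a rough, self-contained sketch of that check (a model, not the Clang API
itself): the source buffer is a std::string and a SourceLocation is just an
offset into it.  tokenLength(), locAfterToken() and endOfStatement() are
hypothetical names, the toy lexer is an assumption standing in for Clang's raw
lexer, and std::string::npos plays the role of an invalid SourceLocation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Crude stand-in for measuring the token that starts at `pos`.
// Assumption: words are alphanumeric runs, the delimiters ;,(){}[] are
// single-character tokens, and any other punctuation run is one operator.
std::size_t tokenLength(const std::string &src, std::size_t pos) {
  assert(pos < src.size());
  auto isWord = [](char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
  };
  auto isOperator = [&](char c) {
    return !isWord(c) && !std::isspace(static_cast<unsigned char>(c)) &&
           std::string(";,(){}[]").find(c) == std::string::npos;
  };
  std::size_t end = pos + 1;
  if (isWord(src[pos])) {
    while (end < src.size() && isWord(src[end])) ++end;
  } else if (isOperator(src[pos])) {
    while (end < src.size() && isOperator(src[end])) ++end;
  }
  return end - pos;
}

// Models the idea behind findLocationAfterToken(): step over the token at
// `tokStart`, skip whitespace, and succeed only if `expected` comes next.
// Returns the offset just past `expected`, or npos if it was not found.
std::size_t locAfterToken(const std::string &src, std::size_t tokStart,
                          char expected) {
  std::size_t pos = tokStart + tokenLength(src, tokStart);
  while (pos < src.size() &&
         std::isspace(static_cast<unsigned char>(src[pos])))
    ++pos;
  if (pos < src.size() && src[pos] == expected) return pos + 1;
  return std::string::npos;
}

// The heuristic described above: `locEnd` is where the statement's last
// token starts (what getLocEnd() reports). A '}' is stepped over directly;
// anything else must be followed by a ';'.
std::size_t endOfStatement(const std::string &src, std::size_t locEnd) {
  assert(locEnd < src.size());
  if (src[locEnd] == '}') return locEnd + 1;
  return locAfterToken(src, locEnd, ';');
}
```

In this model the space in "res ++ ;" makes no difference, because whitespace
is skipped before looking for the ';'.  A result of npos corresponds to
checking loc.isValid() on the real return value.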


On Wed, Feb 6, 2013 at 4:43 PM, Antoine Trouve <trouve at isit.or.jp> wrote:

> Thank you for your remark.
>
> As a matter of fact, I take into account the two following situations:
>
> - if the body of the for is a compound statement, I expect it to be
> delimited by braces. Therefore, I call getLocEnd() and expect it to point
> just before the ending brace, then getLocForEndOfToken() to get the end.
>
> - otherwise, I expect it to be a statement that ends with a semicolon. In
> this case I call findLocationAfterToken() as kindly suggested by other
> people in this thread.
>
> I tried with a large benchmark (SPEC2006) and it seems that it works.
>
> And I agree with your last statement :p
>
> - Antoine
>
> On H.25/02/07 at 1:36, Robert Ankeney <rrankene at gmail.com> wrote:
>
> > Keep in mind that the body of a for statement can be not only a
> > CompoundStmt ending in tok::r_brace, but any statement, including another
> > "for" statement, which in turn could end in either a semicolon or brace.
> > As I recall, getLocEnd in any of these cases will point to the token
> > before either the ";" or the "}", but you might want to check that.  And
> > you can probably check the result of loc =
> > Lexer::findLocationAfterToken(...) for loc.isValid() to see if the token
> > was indeed found there.
> >
> > It seems like some generic Stmt call such as getLocForEndOfStmt() would
> > really be useful.
> >
> > Robert
> >
> > >
> > > On Feb 4, 2013, at 17:37 , Antoine Trouve <trouve at isit.or.jp> wrote:
> > >
> > >> My bad, I was using the function the wrong way.
> > >>
> > >> But I noticed that I couldn't go through a semicolon using
> > >> "Lexer::getLocForEndOfToken" if there is a space before the semicolon.
> > >>
> > >> For instance, let's consider this code:
> > >>
> > >>      for(i=0; i<mand(N,N); i++) res ++ ;
> > >>
> > >> Initially, the SourceLocation retrieved with "getLocEnd()" is before
> > >> the "++":
> > >>
> > >>      for(i=0; i<mand(N,N); i++) res /*HERE*/++ ;
> > >>
> > >> If I call "Lexer::getLocForEndOfToken", it points just after the "++",
> > >> which is nice:
> > >>
> > >>      for(i=0; i<mand(N,N); i++) res ++/*HERE*/ ;
> > >>
> > >> Then if I call getLocForEndOfToken again, the result points to the
> > >> exact same location (I need to call "getLocWithOffset"):
> > >>
> > >>      for(i=0; i<mand(N,N); i++) res ++/*STILL HERE*/ ;
> > >>
> > >> If there is no space before the semicolon, the second call to
> > >> getLocForEndOfToken returns a location just after it:
> > >>
> > >>      for(i=0; i<mand(N,N); i++) res ++;/*HERE*/
> > >>      (no space before the ";")
> > >>
> > >> Is that the expected behaviour? I find it pretty annoying in my
> > >> situation because I have no choice but to loop with
> > >> "SourceLocation::getLocWithOffset" until I find a ";".
> > >
> > > Hm. It's expected behavior because an Expr can be nested inside other
> > > Exprs, in which case you only want the beginning and end of the Expr to
> > > include the expression itself. Consider "a + b * c;". The semicolon is
> > > not part of "b * c"; you could argue it's part of "a + b * c", but
> > > that's not consistent. On the other hand, we otherwise don't track the
> > > location of the semicolon anywhere.
> > >
> > > I can see this being an actual deficiency we want to fix; if you're
> > > interested in pursuing this, please file a bug report at
> > > http://llvm.org/bugs/. Meanwhile, Lexer::findLocationAfterToken will
> > > probably be a more resilient way to find the semicolon.
> >
> > Yes, the behaviour is not consistent from my point of view. I'll file a
> > bug report.
> > Anyway, I'm using "findLocationAfterToken" as you suggested and it works
> > like a charm: thank you for your help!
> >
> > - Antoine
> >
>
>
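
To make the quoted "a + b * c;" remark concrete, here is a small sketch under
stated assumptions: ExprRange and rangeOf() are hypothetical stand-ins (not
Clang types), offsets into a std::string replace SourceLocations, and, as with
token ranges, the end of a range is where its last token starts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an expression's source range. `lastTok` is the
// offset where the expression's final token starts.
struct ExprRange {
  std::size_t begin;
  std::size_t lastTok;
};

// Locate `text` inside `src` and describe it as a token range, assuming its
// final token is a single character (true for the operands used here).
ExprRange rangeOf(const std::string &src, const std::string &text) {
  std::size_t begin = src.find(text);
  assert(begin != std::string::npos && !text.empty());
  return {begin, begin + text.size() - 1};
}
```

With src = "a + b * c;", rangeOf(src, "b * c") and rangeOf(src, "a + b * c")
end on the same token, the final "c", and the ';' lies outside both.  That is
why the end location alone cannot say which expression, if any, owns the
semicolon.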