[cfe-dev] from source location to token?

Roberto Bagnara bagnara at cs.unipr.it
Mon Nov 24 23:38:52 PST 2008


Chris Lattner wrote:
> On Nov 24, 2008, at 7:53 AM, Roberto Bagnara wrote:
>>> It would be interesting to know a little bit more about what you are
>>> doing. In general, re-lexing/re-parsing isn't very desirable (though
>>> it may be appropriate if it's uncommon).
>>
>> Hi Steve,
>>
>> there are program analyses that, for instance:
>>
>> 1) need to reason about an exact representation of floating-point
>>    literals (any approximation may result in an unsound analysis);
> 
> Hi Roberto,
> 
> You don't need the original Token to get this.  Clang's source location 
> information is precise enough that, given a SourceLocation, you can 
> re-lex the token at any later point.  This property is inherent to 
> how we handle SourceRanges: the begin and end locations of a range 
> each point to the *start* of a token (the first and the last token in 
> the range, respectively).  To render through the end of the last 
> token, the diagnostics machinery re-lexes it, which gives exactly the 
> original spelling and length.
> 
> You can see this in action with the command line driver.  Consider this 
> program:
> 
> struct s;
> void foo(struct s *s) {
>    *s + 0.12321e-42;
> }
> 
> -fsyntax-only prints:
> 
> t.c:7:7: error: invalid operands to binary expression ('struct s' and 'double')
>    *s + 0.12321e-42;
>    ~~ ^ ~~~~~~~~~~~
> 
> It finds the end of the fp literal by re-lexing the token with the 
> code in Lexer::MeasureTokenLength.
> 
> Given a SourceLocation and the length of the token (as returned by 
> MeasureTokenLength) you can get the exact original spelling.  Beware 
> that this is the raw source text, so it includes any trigraphs and 
> escaped newlines.  The pointer to the start of the string is obtained 
> with:
> 
> const char *StrData = SourceMgr.getCharacterData(Loc);
> 
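
To illustrate the pointer-plus-length idea without pulling in Clang, here
is a minimal sketch. It assumes the source buffer is a plain
NUL-terminated string and uses two hypothetical helpers,
measureNumberLength and spellingAt, that are not Clang APIs. The first is
only a simplified stand-in for what Lexer::MeasureTokenLength does for
numeric literals, and it ignores trigraphs and escaped newlines.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (NOT a Clang API): length of the preprocessing
// number that starts at Start, or 0 if no number starts there.
// Simplification: trigraphs and escaped newlines are not handled.
std::size_t measureNumberLength(const char *Start) {
  const char *P = Start;
  const bool StartsWithDigit = std::isdigit(static_cast<unsigned char>(*P)) != 0;
  const bool StartsWithDotDigit =
      *P == '.' && std::isdigit(static_cast<unsigned char>(P[1])) != 0;
  if (!StartsWithDigit && !StartsWithDotDigit)
    return 0;
  ++P;
  while (*P != '\0') {
    const unsigned char C = static_cast<unsigned char>(*P);
    // A sign belongs to the number only right after an exponent marker.
    const bool SignAfterExponent =
        (C == '+' || C == '-') &&
        (P[-1] == 'e' || P[-1] == 'E' || P[-1] == 'p' || P[-1] == 'P');
    if (SignAfterExponent || std::isalnum(C) || C == '_' || C == '.')
      ++P;
    else
      break;
  }
  return static_cast<std::size_t>(P - Start);
}

// Hypothetical helper: the exact source text of the number at Offset.
// Buffer + Offset plays the role of the StrData pointer above.
std::string spellingAt(const char *Buffer, std::size_t Offset) {
  assert(Buffer != nullptr);
  const char *StrData = Buffer + Offset;
  return std::string(StrData, measureNumberLength(StrData));
}
```

With the line from the example program,
spellingAt("   *s + 0.12321e-42;", 8) yields the text 0.12321e-42
unchanged, which an analysis can then parse exactly instead of relying on
a rounded double.
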
>>
>> 2) need to reason about the textual representation used in the
>>    program for integer literals as well (for example, there are
>>    coding rules that forbid the use of octal constants: the analyzer
>>    should flag their use in the source program).
> 
> Sure, Clang can handle this sort of thing with no problem.
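
For the octal rule, once the original spelling of a literal is available,
the check itself is small. The following is a sketch under assumptions:
looksLikeOctalConstant is a hypothetical name, it only strips the
u/U/l/L integer suffixes, it does not handle C++14 digit separators, and
it deliberately does not flag a plain 0 (which is formally an octal
constant in C, but which such coding rules usually permit).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: does this literal spelling denote an octal
// integer constant other than a plain zero?
bool looksLikeOctalConstant(const std::string &Spelling) {
  if (Spelling.size() < 2 || Spelling[0] != '0')
    return false;

  // Drop a trailing integer suffix such as "u", "UL" or "ll".
  std::size_t End = Spelling.size();
  while (End > 1) {
    const char C = Spelling[End - 1];
    if (C == 'u' || C == 'U' || C == 'l' || C == 'L')
      --End;
    else
      break;
  }
  if (End < 2)
    return false; // Only "0" plus a suffix remains, e.g. "0L".

  // Every remaining character must be an octal digit. This rejects
  // hex ("0x1F"), binary ("0b1") and floating-point ("012.5", "0e1").
  for (std::size_t I = 1; I < End; ++I)
    if (Spelling[I] < '0' || Spelling[I] > '7')
      return false;
  return true;
}
```

Working on the spelling rather than on the value is the point here: 017
and 15 have the same value, and only the source text tells them apart.
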

Great, thanks!
All the best,

    Roberto

-- 
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagnara at cs.unipr.it


