[cfe-dev] FW: get the text in a C file between 2 SourceLocations
steve naroff
snaroff at apple.com
Wed Sep 10 08:05:51 PDT 2008
On Sep 9, 2008, at 9:48 PM, b.j.burgers at student.utwente.nl wrote:
> Thanks for your quick response Snaroff.
> I chose to change the function Lexer::LexNumericConstant and
> generate my own kind of token if I find an s at the end of a
> numeric_constant.
> Now I want to separate the number from the s.
>
> Is there an easy way of getting the text of a token in the C file?
Not directly. You need to go through the SourceManager as follows...
const char *sourceText = SM->getCharacterData(Tok.getLocation());
If you want the token type name, this will suffice...
const char *tokenName = Tok.getName();
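To separate the number from the trailing "s", note that the pointer returned by getCharacterData() is not terminated at the end of the token, so the token length is needed as well (I am assuming Tok.getLength() here). A minimal sketch follows; splitNumberAndSuffix is a hypothetical helper, not part of clang, and it only handles plain decimal digits:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not a clang API): splits the spelling of a token such
// as "1000s" or "100ms" into its leading decimal digits and the rest.
// `text` points at the first character of the token and `length` is the
// token length, e.g. obtained (by assumption) from
//   SM->getCharacterData(Tok.getLocation()) and Tok.getLength().
std::pair<std::string, std::string>
splitNumberAndSuffix(const char *text, std::size_t length) {
  std::size_t i = 0;
  // Advance over the digits; the cast avoids undefined behavior in isdigit
  // for characters with negative values.
  while (i < length && std::isdigit(static_cast<unsigned char>(text[i])))
    ++i;
  // first = digits, second = everything after them up to the token end.
  return {std::string(text, i), std::string(text + i, length - i)};
}
```

For the spelling "1000s" this yields the pair ("1000", "s"). Hex, octal, and floating-point spellings would need extra handling.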
>
> I searched for this a while ago too but couldn’t find it. I found
> the SourceManager::getCharacterData(SourceLocation) function, but
> it returns all characters starting from the SourceLocation.
> Is there a way to get a char * to the text between 2 SourceLocations?
>
These should do the trick...
// converts SourceLocations into "char *"s
const char *startBuf = SM->getCharacterData(LocStart);
const char *endBuf = SM->getCharacterData(LocEnd);
// converts a "char *" offset into a SourceLocation
// (p is any pointer into the buffer, with startBuf <= p <= endBuf)
SourceLocation OptionalLoc = LocStart.getFileLocWithOffset(p - startBuf);
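Once both pointers are in hand, the text between them is just the half-open range [startBuf, endBuf). A minimal sketch, assuming both locations lie in the same file buffer; textBetween is a hypothetical helper, not a clang API:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a clang API): copies the characters in
// [startBuf, endBuf) into a std::string. Both pointers are assumed to point
// into the same buffer, as they would if LocStart and LocEnd are in the same
// file. Returns an empty string for null or reversed ranges.
std::string textBetween(const char *startBuf, const char *endBuf) {
  if (!startBuf || !endBuf || endBuf < startBuf)
    return std::string();
  return std::string(startBuf, static_cast<std::size_t>(endBuf - startBuf));
}
```

If LocEnd marks the start of the last token rather than its end, the result stops just before that token; add the token's length to endBuf to include it.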
snaroff
> Thanks for all the help,
>
> Bas
>
> From: steve naroff [mailto:snaroff at apple.com]
> Sent: Tuesday, September 9, 2008 15:02
> To: Burgers, B.J. (Bas, Student EMSYS)
> CC: cfe-dev at cs.uiuc.edu
> Subject: Re: [cfe-dev] changing the lexer or parser
>
> Hi Bas,
>
> clang currently implements C integer constants by including the
> trailing suffix (see C99 6.4.4.1 for more details).
>
> Sema::ActOnNumericConstant() is then responsible for determining
> the type of constant (integer, floating) and size.
>
> I haven't thought about adapting clang's lexer to generate tokens
> that don't conform to C.
>
> That said, you could simply examine the "suffix" by hand (without
> fiddling with the lexer directly).
>
> snaroff
>
> On Sep 9, 2008, at 5:27 PM, b.j.burgers at student.utwente.nl wrote:
>
>
> Hello,
>
> I’m working on a tool that allows time constructs in C. I implemented
> this tool by adapting Clang.
> In these time constructs I would like to allow arguments like “1000s”,
> “1000 s”, “100ms”, “100 ms”, etc.
> The lexer creates 1 token called numeric_constant if the argument is
> “1000s”, even if ‘s’ is added as a keyword or token in TokenKinds.def.
> I hoped the lexer would generate two tokens: a
> numeric_constant and an identifier (or self-defined token).
> What is the best way to allow these kinds of arguments? Do I have to
> create a new token that allows some digits followed by an ‘s’?
>
> Thanks for any help,
>
> Bas Burgers
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev