[cfe-dev] Determining macros used in a function or SourceRange (using clang plugin)

Eric Bayer via cfe-dev cfe-dev at lists.llvm.org
Mon Sep 26 22:41:52 PDT 2016


First off, thanks so much for your help (and probably patience at this 
point).  Okay, that all works with a few tweaks.  I spent most of the 
day trying to figure out how to get the definition.  I have been looking 
at getSpellingLoc(), which seems to get me one end of it, but I can't 
figure out how to find the end of the definition.  If this were just a 
string, I'd scan until I found a line break that wasn't preceded by 
a \.  So far I have tried constructing a Lexer and using 
ReadToEndOfLine() and LexFromRawLexer(), based on some things I found 
online.  Neither seemed to work.  My eventual goal is to get another 
SourceRange and check it for macros as well, i.e. to check for any 
macro dependency trees; right now the return type is StringRef just for 
debugging.  I've attached the code I tried below.  ReadToEndOfLine() 
seems to never advance anything, and LexFromRawLexer() seems to never 
come across a tok::eod.  :/  Some output is below the function clip.  
Maybe there's an entirely easier approach?


SourceManager &SM) {

std::pair<FileID, unsigned> cur_info = SM.getDecomposedLoc(BeginLoc);
StringRef buf = SM.getBufferData(cur_info.first, &invalid);

if (invalid) {

// Get the point in the buffer
const char *point = buf.data() + cur_info.second;

// Make a lexer and point it at our buffer and offset
Lexer lexer(SM.getLocForStartOfFile(cur_info.first), LangOpts,
            buf.begin(), point, buf.end());

while (1) {
// read through the end of line

if (text.back() != '\\') {

llvm::errs() << "Incomplete line, so far: " <<
getCodeString(SM, BeginLoc, lexer.getFileLoc(), "Token") << "\n";

return getCodeString(SM, BeginLoc, lexer.getFileLoc(), "Definition");
Token tok;
while(1) {

if(tok.is(tok::eof) || tok.is(tok::eod)) {

llvm::errs() << "Token[" << tok.getName() << "]: \"" <<
getCodeString(SM, tok.getLocation(), tok.getEndLoc(), "Token") <<

return getCodeString(SM, BeginLoc, tok.getEndLoc(), "Definition");

Example failure on tokens (ignore the fact that we're sort of printing 
two tokens on every line: getEndLoc() seems to really be the next token, 
and getCodeString() seems to print on token boundaries):

Macro name: ASSERT
Macro string: ASSERT((getFirstMatchingOnly && firstMatching != nullptr) ||
           (!getFirstMatchingOnly && (allMatchingMo != nullptr ||
                                      allMatchingMoRef != nullptr)))
Token[raw_identifier]: "ASSERT_IFNOT("
Token[l_paren]: "(cond"
Token[raw_identifier]: "cond,"
Token[comma]: ","
Token[raw_identifier]: "_ASSERT_PANIC("
Token[l_paren]: "(AssertAssert"
Token[raw_identifier]: "AssertAssert)"
Token[r_paren]: "))"
Token[r_paren]: ")"                                   <---- I'd expect an 
eod token here.  Guessing though.
Token[hash]: "#define"
Token[raw_identifier]: "define"
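
For comparison, the plain-string fallback mentioned above (scan until a 
line break that isn't preceded by a \) could look roughly like the 
sketch below.  It is only an illustration on a raw buffer and does not 
use Clang at all: findLogicalLineEnd() and logicalLine() are made-up 
helper names, and the sketch assumes the backslash sits immediately 
before the line break.  It ignores trigraphs, whitespace between the 
backslash and the line break, and comments that span lines.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Clang API): returns the offset just past the
// last character of the logical line starting at `begin`, treating a
// backslash immediately before a line break as a line continuation.
std::size_t findLogicalLineEnd(const std::string &buf, std::size_t begin) {
  assert(begin <= buf.size() && "begin must lie inside the buffer");
  for (std::size_t i = begin; i < buf.size(); ++i) {
    if (buf[i] != '\n')
      continue;
    // Step back over an optional '\r' so that "\\\r\n" also continues.
    std::size_t prev = i;
    if (prev > begin && buf[prev - 1] == '\r')
      --prev;
    const bool continued = prev > begin && buf[prev - 1] == '\\';
    if (!continued)
      return prev; // end of the definition, excluding the line break
  }
  return buf.size(); // buffer ended before an unescaped line break
}

// Hypothetical convenience wrapper: the text of that logical line.
std::string logicalLine(const std::string &buf, std::size_t begin) {
  return buf.substr(begin, findLogicalLineEnd(buf, begin) - begin);
}
```

With a buffer from SM.getBufferData() and the offset from 
SM.getDecomposedLoc(), the returned offset could presumably be turned 
back into a SourceLocation by offsetting from the start of the file, 
though that part is untested here.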

On 9/26/2016 3:12 PM, Alex L wrote:
> On 26 September 2016 at 14:55, Eric Bayer <ebayer at vmware.com> wrote:
>     Thanks Alex,
>     That gets me mostly there.  Pardon if that is a dumb question, but
>     I'm not sure how I go from a SourceLocation to a Token.  I have
>     not worked at all in the preprocessor levels before.
> Something like this should work:
>     StringRef getToken(SourceLocation BeginLoc, SourceManager &SM,
>                        LangOptions &LangOpts) {
>       const SourceLocation EndLoc =
>           Lexer::getLocForEndOfToken(BeginLoc, 0, SM, LangOpts);
>       return Lexer::getSourceText(
>           CharSourceRange::getTokenRange(BeginLoc, EndLoc), SM, LangOpts);
>     }
