[cfe-dev] libclang and non-parsed pragmas

Meadows, Lawrence F lawrence.f.meadows at intel.com
Wed Nov 6 13:38:25 PST 2013


Dear cfe-dev:

I'm brand new to this list, but I've done a bit of research and hacking on both the clang internals and the Python bindings to libclang.

The specific problem I'm having is trying to get identifier type information from a non-parsed pragma.  For concreteness I have a pragma statement similar to:

#pragma stuff var1 var2 var3

Of course, clang doesn't know about #pragma stuff, so when I compile a module that uses it, nothing appears in the AST dump, as expected.

I'm able to point at the pragma line in the file with something like this:
    import clang.cindex as CL

    # inputname, pragmaStartLine, pragmaEndLine and line are set
    # earlier in the script (not shown).
    index = CL.Index.create()
    clangTu = index.parse(inputname)
    clangFile = CL.File.from_name(clangTu, inputname)
    pragmaStart = CL.SourceLocation.from_position(clangTu, clangFile,
                                                  pragmaStartLine, 1)
    pragmaEnd = CL.SourceLocation.from_position(clangTu, clangFile,
                                                pragmaEndLine, len(line)-1)
    pragmaRange = CL.SourceRange.from_locations(pragmaStart, pragmaEnd)
    # Collect every token the lexer produced inside the pragma's range.
    pragmaTokens = list(clangTu.get_tokens(extent=pragmaRange))

This works fine: I get a list, pragmaTokens, and I can read each token's spelling, kind, and so forth. The problem occurs when I try to access the cursor for one of the tokens, e.g.:
(Pdb) p var
<clang.cindex.Token object at 0x22e1b90>
(Pdb) p var.cursor.spelling
None
(Pdb) p var.cursor.kind
CursorKind.COMPOUND_STMT
(Pdb) p var.cursor.location
<SourceLocation file 'dgemm.tmpl.c', line 23, column 1>
(Pdb) p var.location
<SourceLocation file 'dgemm.tmpl.c', line 26, column 36>
(Pdb) p var.kind
TokenKind.IDENTIFIER

You can see that the cursor is confused: it actually points at the closest enclosing compound statement, not at the lexical location of the identifier in the pragma.

My theory is that since clang never parsed those tokens, it is just trying to find the closest source location that actually exists in the AST; stepping through libclang in gdb seems to confirm that.
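A script can detect this situation rather than trust the cursor blindly. The sketch below is only a heuristic, under the assumption that an identifier clang really parsed has a cursor whose location coincides with the token's own location (as it does for a plain variable reference or declaration). The helper names are hypothetical, and the objects are only expected to expose the `kind`, `spelling`, `location` and `cursor` attributes that `clang.cindex.Token` provides.

```python
def cursor_matches_token(token):
    """Heuristic: True if the token's cursor starts exactly at the token.

    Assumes both locations are in the same file, which holds when the
    tokens come from a single source range.
    """
    cur_loc = token.cursor.location
    tok_loc = token.location
    return (cur_loc.line == tok_loc.line
            and cur_loc.column == tok_loc.column)


def unparsed_identifiers(tokens):
    """Return spellings of identifier tokens whose cursor lies elsewhere."""
    return [tok.spelling for tok in tokens
            if tok.kind.name == "IDENTIFIER"
            and not cursor_matches_token(tok)]
```

With the session shown earlier, the token at line 26, column 36 has a cursor at line 23, column 1, so this check would flag it as not attached to any AST node of its own.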

What I really want to do is to look up the identifier in the context of the pragma's lexical location and find the variable to which it corresponds, as if that variable had actually been referenced in the source code; but I currently don't see anything that lets me do that.

I can think of a couple of hacks that will probably get me close enough (either try to figure out the source code completion stuff, or walk the variable declarations and build a dictionary) but I'd appreciate any words of wisdom from the clang experts.
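The second hack (walking the variable declarations and building a dictionary) might look roughly like the sketch below. It is not a libclang feature. It is a hand-rolled approximation of scope lookup, and both function names are hypothetical. It assumes that only declarations in the same file as the pragma matter, that only VAR_DECL and PARM_DECL cursors are of interest, and that a declaration is visible if it appears before the pragma line in a scope whose extent encloses that line. Kinds are compared by name, so the functions only need objects shaped like `clang.cindex.Cursor` and `clang.cindex.Token`.

```python
DECL_KINDS = {"VAR_DECL", "PARM_DECL"}


def visible_decls(cursor, filename, line, table=None):
    """Map identifier -> declaration cursor visible at `line` of `filename`.

    Children are visited in source order, so a declaration in an inner
    enclosing scope overwrites (shadows) an outer one of the same name.
    """
    if table is None:
        table = {}
    for child in cursor.get_children():
        loc_file = child.location.file
        # Assumption: ignore anything that comes from another file.
        if loc_file is None or loc_file.name != filename:
            continue
        kind = child.kind.name
        if kind in DECL_KINDS:
            # Only declarations that precede the pragma are visible.
            if child.location.line < line:
                table[child.spelling] = child
        elif kind == "DECL_STMT":
            # Local declarations are wrapped in a DECL_STMT.
            if child.extent.end.line < line:
                visible_decls(child, filename, line, table)
        elif child.extent.start.line <= line <= child.extent.end.line:
            # Descend only into scopes that enclose the pragma.
            visible_decls(child, filename, line, table)
    return table


def resolve_identifiers(tokens, table):
    """Map each identifier token to its declared type spelling, or None."""
    result = {}
    for tok in tokens:
        if tok.kind.name == "IDENTIFIER":
            decl = table.get(tok.spelling)
            result[tok.spelling] = (decl.type.spelling
                                    if decl is not None else None)
    return result
```

Applied to the variables from the earlier snippet, this would be called as `resolve_identifiers(pragmaTokens, visible_decls(clangTu.cursor, inputname, pragmaStartLine))`. The words `pragma` and `stuff` are identifier tokens too, so they would normally map to None.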

Thanks.

-- Larry



