[cfe-users] Syntax Parser using Clang

david.weber at l-3com.com david.weber at l-3com.com
Thu Sep 19 13:11:38 PDT 2013


I'm attempting to write a syntax parser using clang that lists the tokens (variables, types, etc.) in a source file for use by another toolset.


I have a file specified as


#include "bogus.h"
const bogus::bogus_type bogus::bogus_val = 0;
int main(){
     int alpha = 40;
     return(-1);
}



and I would expect to get a list of something like:
                bogus::bogus_type
                bogus::bogus_val
                alpha
                main
                ...
                (built in values as well)


I'm using the C API, and for simple C files it works fine. However, once I start parsing C++ code, especially when a header file isn't found (again, I only care about syntax, not whether the code is valid), it's giving me only:

alpha
main

and ignoring the bogus values. I'm starting to think that I am using too powerful a tool for what I need, and that I should find something "dumber".


My main() follows (the whole file is also attached).

Compile with:
g++ parse.cpp -lclang -L/usr/lib64/llvm -o parse


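int main(int argc, char* argv[]){
    // The source file to parse must be given as the first argument
    if(argc < 2){
        std::cerr << "Usage: " << argv[0] << " <source-file>" << std::endl;
        return 1;
    }

    init_filter();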

    CXIndex index = clang_createIndex(1, 1);

    unsigned int options = CXTranslationUnit_None;

    // We don't want to expand any #include statements
    // so disable the standard include locations
    const unsigned int num_args = 2;
    const char* const args[num_args] = {
        "-nostdlibinc",
        "-nostdinc"
    };

    std::cout << "-----------------" << std::endl;
    // Parse the file
    CXTranslationUnit tu =
        clang_parseTranslationUnit(
            index,       // index to associate w/ this translation unit
            argv[1],     // source file name
            args,        // command line args
            num_args,    // number of command line args
            NULL,        // unsaved files
            0,           // number of unsaved files
            options
        );

    std::cout << "-----------------" << std::endl;

    // Get a cursor into the parsed file
    CXCursor cursor = clang_getTranslationUnitCursor(tu);

    if(clang_Cursor_isNull(cursor)){
        std::cout << "Cursor was NULL!" << std::endl;
        exit(-1);
    }

    // Visit the children
    clang_visitChildren(cursor, visitor, NULL);

    // Print out the unique tokens
    std::cout << std::endl << "Unique Tokens:" << std::endl;
    for(tokenSet_t::iterator iter = token_set.begin();
        iter != token_set.end();
        ++iter)
    {
        std::cout << "\t" << *iter << std::endl;
    }

    // Release the libclang resources
    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(index);

    return 0;
}



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: simple.cpp
URL: <http://lists.llvm.org/pipermail/cfe-users/attachments/20130919/6f07d89b/attachment.ksh>