[cfe-dev] [libclang] [python][request] Is clang_tokenize and clang_getTokenSpelling being added to the python bindings anytime soon?

Gregory Szorc gregory.szorc at gmail.com
Thu Jun 28 11:40:15 PDT 2012

I already have this Python code written in a personal branch and just 
need to integrate it with the official bindings. This involves a little 
refactoring to make the API nicer to consume, etc.

I'll try to submit a patch for review by the end of next week.

While I'm at it, I may also add support for the comment APIs recently 
added to libclang since they are somewhat related.


On 6/28/12 11:20 AM, Carlos Andrade wrote:
> Dears,
> I was recently able to get my (very messy) C version to output 
> the values of the cursors I wanted from libclang, and I would very much 
> appreciate knowing whether the functions I listed in 
> the title of this e-mail, *clang_tokenize* and *clang_getTokenSpelling*, 
> will be available in the python bindings anytime soon.
> More specifically, this is what I am looking for. If I recurse the tree 
> using the python interface, the most I get, to the best of my knowledge, is:
> Sample source line:    int loren = 2 + 2;
> Cursor representing the binary operator (using cursor.kind, 
> cursor.spelling, cursor.displayname, cursor.location, 
> cursor.hash):
> CursorKind.BINARY_OPERATOR None  <SourceLocation file 'simple.c', 
> line 11, column 14> 3289119033
>                         type: TypeKind.INT
>                         canonical type: TypeKind.INT
> As you can see, I can't tell which binary operator it is (I 
> get None). The same applies to other cursors as well.
> However, by combining a C libclang snippet that I found 
> somewhere on Stack Overflow with my own code, I was able to extract the 
> tokens and obtain this information with some effort (I get (null) 
> instead of None, but since I have access to the tokens associated with 
> the cursor I can get at the operator).
> The output is: BinaryOperator  (null)||| Start: Line: 11 Column: 14 
> Offset: 103 |||token = 2 token = + token = 2 token = ;
> See the code highlighted here: http://paste2.org/p/2062173
> ---------------------------------------------------------------------------------------------------
>     if (kindType.kind != CXType_Invalid)
>     {
>         CXSourceRange range = clang_getCursorExtent(cursor);
>         CXToken *tokens = 0;
>         unsigned int nTokens = 0;
>         clang_tokenize(TU, range, &tokens, &nTokens);
>         for (unsigned int i = 0; i < nTokens; i++)
>         {
>             CXString spelling = clang_getTokenSpelling(TU, tokens[i]);
>             printf("token = %s\n", clang_getCString(spelling));
>             clang_disposeString(spelling);
>         }
>         clang_disposeTokens(TU, tokens, nTokens);
>     }
> ---------------------------------------------------------------------------------------------------
> I searched cindex.py and the branch on the website 
> (http://llvm.org/svn/llvm-project/cfe/branches/tooling/bindings/python/clang/cindex.py) 
> but I haven't seen anything related to tokens so far.
> My motivation for asking is that getting to see the tokens is 
> just a small fraction of what I need, and I would like to stick 
> with python rather than trying it directly in C. Also, my python 
> version is mostly done and this is the only part holding me back, 
> whereas in C there are many things that I don't yet see how to do 
> the way I do them in python.
> Thank you very much for your attention.
> Best Regards,
> Carlos Andrade
> http://carlosandrade.co
> 2012/6/18 Carlos Andrade <carlosviansi at gmail.com>
>     Thank you Gregory! I am very happy to know that! The methods
>     available in 3.1 work just fine with my code, so I am fine for now!
>     I might be using other functionality from libclang in the near
>     future that might not be in the current python interface, so I will
>     for sure check on this :)
>     I very much appreciate your offer on fast-tracking any specific
>     features and will keep that in mind!
>     Best Regards,
>     Carlos Andrade
>     http://carlosandrade.co
>     2012/6/18 Gregory Szorc <gregory.szorc at gmail.com>
>         On 6/18/12 10:13 AM, David Röthlisberger wrote:
>                 The reason I stayed with C was that I saw a few
>                 comments that python was still catching up on making
>                 available the methods provided by libclang,
>             That's true in the sense that any new feature must be
>             added to the C interface before it can be added to the
>             python bindings; that is just the nature of foreign
>             bindings. But I don't know how far behind the python
>             bindings are currently lagging (if at all). Gregory Szorc
>             (CCd) is one of the maintainers of the python bindings --
>             perhaps he can comment.
>         I have a branch of the Clang bindings that are nearly feature
>         complete at
>         https://github.com/indygreg/clang/tree/python_features.
>         However, I /think/ that tree may be busted right now, so use
>         at your own risk.
>         For the past ~6 months I've been slowly moving patches from my
>         repository into the mainline. It has been a long process. Now
>         that I have Manuel as a reliable reviewer, things could start
>         moving faster. I've just been busy with other projects.
>         If there is a particular feature missing from the in-tree
>         Python bindings, I probably have code for it somewhere. If
>         anyone asks kindly, I can probably fast-track specific
>         features to the main tree.
>         Gregory
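
On the binary-operator question in the quoted message: once token spellings are available for a cursor, one way to pick out the operator is to skip the tokens that belong to the left-hand operand. The following is only a minimal Python sketch of that idea and is not part of cindex.py. The function name operator_spelling and both of its parameters are hypothetical. It works on plain lists of spellings, such as the "2 + 2 ;" run printed by the C snippet, and it assumes the left operand's list holds exactly that operand's tokens. The quoted output shows a trailing ";" past the expression, so token ranges taken from clang_tokenize may need trimming before they are passed in.

```python
def operator_spelling(parent_tokens, lhs_tokens):
    """Return the operator token of a binary expression, or None.

    parent_tokens: spellings of the tokens covering the whole
                   binary-operator cursor (hypothetical input).
    lhs_tokens:    spellings of the tokens covering only its left-hand
                   operand (hypothetical input, assumed to be exact).
    """
    n = len(lhs_tokens)
    # The left operand's tokens must be a prefix of the parent's tokens.
    if parent_tokens[:n] != lhs_tokens:
        return None
    # The first token after the left operand is the operator.
    if n < len(parent_tokens):
        return parent_tokens[n]
    return None


# Token spellings as printed by the C snippet for: int loren = 2 + 2;
tokens = ["2", "+", "2", ";"]
print(operator_spelling(tokens, ["2"]))
```

The same lookup should carry over to longer left operands, e.g. ["a", "[", "i", "]"] in a[i] * b, as long as the operand's token list is exact.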
