[cfe-dev] Auto-generation of ASTMatchers predicates from source code, proof-of-concept

Manuel Klimek klimek at google.com
Tue Jun 19 05:57:17 PDT 2012


On Tue, Jun 19, 2012 at 2:53 PM, Evgeny Panasyuk
<evgeny.panasyuk at gmail.com> wrote:

>  19.06.2012 16:35, Manuel Klimek wrote:
>
>>>> Or maybe some interactive (maybe GUI) tool for building
>>>> predicates? I remember Chandler mentioning something similar at
>>>> http://www.youtube.com/watch?v=yuIOGfcOH0k&t=27m56s
>>>>
>>>
>>>  Now we're talking the next step :) Yea, having a GUI would be *great*
>>> (and just so we're clear: with GUI I mean a web page :P)
>>>
>>>
>>>  And maybe AST database optimized for fast predicate matches :)
>>>
>>
>>  For small projects this might be interesting - for us the question is
>> how that would scale - we've found parsing the C++ code to be actually an
>> interesting way to scale the AST, for the small price of needing up to 3-4
>> seconds per TU (on average). Denormalizing the AST itself produces a huge
>> amount of data, and denormalizing even more seems like a non-starter.
>>
>>  Thoughts?
>>
>>
>>  It depends on how much you would like to scale. And yes, it also depends
>> on project sizes.
>> For instance, if the required scaling is one task per TU, that is one case.
>>
>
>  Perhaps I need to expand on what I mean here:
> Imagine you have on the order of 100MLOC.
>  If you want an "AST database" for predicate matches, the question is what
> indexes you create. If you basically want to create an extra index per
> "matcher", the denormalization takes too much data. If you don't create an
> index per matcher, how do you efficiently evaluate matchers?
>
>
> I understood that part of the previous message.
> My point was that if you have 1k translation units and need to scale up
> to 100k parallel tasks, then it is obvious that "task per TU" is not
> sufficient, and you need to use another approach (maybe pre-parse and split
> the AST).
>

I don't understand the point you're trying to make here yet :)
Are you talking about having the same (parametrized) task done 100k times
in parallel (like: find all references to X, done by many engineers), or
something else? How would a pre-parsed AST help? Perhaps you can expand on
the "obvious" part ;)
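
To make the "task per TU" model concrete, here is a rough Python sketch
(not Clang code). Everything in it is an assumption made for illustration:
parse_tu is a hypothetical stand-in for a real parse that returns a toy
list of (kind, name) nodes, and the predicate plays the role of a matcher.
It only shows that useful parallelism in this model is capped by the
number of TUs, however many workers you ask for.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_tu(path):
    # Hypothetical stand-in for a real parse (the thread puts that at
    # up to 3-4 seconds per TU). Returns a toy "AST" as (kind, name) pairs.
    return [("CallExpr", f"{path}:foo"), ("VarDecl", f"{path}:x")]


def match_tu(path, predicate):
    # One task: parse a single TU and keep the nodes the predicate accepts.
    return [node for node in parse_tu(path) if predicate(node)]


def run_task_per_tu(tu_paths, predicate, workers):
    # With one task per TU, workers beyond len(tu_paths) have nothing to do,
    # so the effective parallelism is min(workers, number of TUs).
    effective = max(1, min(workers, len(tu_paths)))
    with ThreadPoolExecutor(max_workers=effective) as pool:
        per_tu = pool.map(lambda p: match_tu(p, predicate), tu_paths)
        results = [node for nodes in per_tu for node in nodes]
    return effective, results


if __name__ == "__main__":
    tus = ["a.cpp", "b.cpp", "c.cpp"]
    effective, hits = run_task_per_tu(tus, lambda n: n[0] == "CallExpr", 100)
    print(effective, hits)
```

Asking for 100 workers on 3 TUs still yields only 3 concurrent tasks, which
is how I read the 1k TUs vs. 100k tasks example; whether splitting a
pre-parsed AST would get past that cap is exactly the open question.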

Cheers,
/Manuel


>
> Best Regards,
> Evgeny
>