[cfe-dev] code conversion challenge

Manuel Klimek klimek at google.com
Wed Feb 29 08:29:27 PST 2012


On Tue, Feb 28, 2012 at 10:12 AM, Philip Ashmore
<contact at philipashmore.com> wrote:
> On 28/02/12 17:17, Sean Silva wrote:
>> It looks like what you want to do is to run a RecursiveASTVisitor over
>> the AST and essentially cherry-pick certain information off of it. It
>> may be a lot of work, but I think you could do it.
>>
>> Oh, except for the comment. That would be *much* more difficult. In
>> your example you seem to be including the comment as though it were a
>> statement in the body. How would your representation (which reminds me
>> of Prolog btw) represent:
>>
>>   int main(int argc, char * argv[])
>>   {
>>     return /* all ok */ 0;
>>   }
> Return(Comment("all ok"), Int(0i32))
> You could also add File("myfile.sbt"), Line(22) and Column(32) anywhere
> to track the source file.
> It all comes down to what you want to process and how.
>>
>> or
>>
>>   int main(int argc, char * argv[])
>>   {
>>     doSomethingWithALotOfArgs(argv[0], argv, argv+argc,
>>                               /*verbose=*/false);
>>     return 0;
>>   }
>>
>> Code like that last example is extremely common.
>>
>> For a more pathological example, consider
>>
>>   #define X(a,b) a##b
>>   int X(ma,/*pure evil*/in)(int argc, char * argv[])
>>   {
>>     doSomethingWithALotOfArgs(argv[0], argv, argv+argc,
>>                               /*verbose=*/false);
>>     return 0;
>>   }
> , Macro
>   ( name(X)
>   , Parameters(a, b)
>   , Body
>     ( Return(Concat(a, b))
>     )
>   )
> , Function
>   ( Name(X(ma, Comment("pure evil"), in))
>   , Body
>     ( Call(doSomethingWithALotOfArgs, Index(argv, 0), argv,
>            Add(argv, argc), Comment("verbose"), Bool(false))
>
> My parser doesn't distinguish between "built-in" symbols and those used
> in the code.
>> The conclusion is that to actually be useful, your representation
>> would want to make certain things "off limits" or purposefully not
>> representable. It's up to you to draw the line. Once you have that
>> line, you can then get what you want in a pretty straightforward way
>> from clang.
> I think the binary representation would be really useful as a
> pre-compiled header format where even macro expansion is deferred.
>
> I forgot to mention that the format is in-place-editable and with a
> snapshotting filesystem (e.g. fuse) you could efficiently modify it in
> place for one source file, make another snapshot and edit that, and
> then throw the snapshots away.
>
> It's going to be part of my v3c-storyboard SourceForge project, and
> being able to process C/C++ into this format would be a big plus.
>
> Things like extracting function prototypes, automatically determining
> the required include files, source translation all become a lot easier
> this way, as the library has a ridiculously simple C/C++ api - it's all
> about calls, symbols and literals.

Having done a few real-world C++ code transformations recently, I
don't buy that a stripped-down format will help a lot. Most of the
things you propose would need very C++-specific implementations - why
not just write tools against the clang AST for them?

Cheers,
/Manuel

>
>> --Sean Silva
>>
>> On Tue, Feb 28, 2012 at 2:21 AM, Philip Ashmore
>> <contact at philipashmore.com> wrote:
>>
>>     On 28/02/12 07:01, Philip Ashmore wrote:
>>     > Hi there.
>>     >
>>     > Here's the problem:
>>     > Given the source file with this content:
>>     >
>>     >     int main(int argc, char * argv[])
>>     >     {
>>     >       /* all ok */
>>     >       return 0;
>>     >     }
>>     >
>>     > I want to convert it into something like this:
>>     >
>>     >     module
>>     >     ( function
>>     >       ( Name("main")
>>     >       , Returns("int")
>>     >       , Parameters
>>     >         ( Parameter(Type("int"), Name("argc")
>>     >         , Parameter(Type(Array(Pointer("char"), Size()), Name("argv")
>>     >         )
>>     >       , Body
>>     >         ( Comment("all ok")
>>     >         , Return(Int(0i32))
>>     >         )
>>     >       )
>>     >     )
>>     >
>>     > This is a description format that has a binary representation
>>     > that allows for easy depth-first and breadth-first traversal.
>>     >
>>     > With it one can describe C/C++, make files, pre-processor macros
>>     > etc. - the reader supplies the meaning to the "calls" like "module".
>>     >
>>     > With it I hope to be able to describe things like interfaces and
>>     > be able to automate the glue that allows it to be called from
>>     > scripting languages, and much more.
>>     >
>>     > I haven't even given this format a name, but I can convert the
>>     > text above to and from the binary representation.
>>     >
>>     > So that's the challenge - any takers?
>>     >
>>     > Regards,
>>     > Philip Ashmore
>>     >
>>     > _______________________________________________
>>     > cfe-dev mailing list
>>     > cfe-dev at cs.uiuc.edu
>>     > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>     OK, maybe not this exact example - the parameters are missing ')', but
>>     you get the idea.
>>
>>     Philip



