[cfe-dev] code conversion challenge

Philip Ashmore contact at philipashmore.com
Tue Feb 28 10:12:45 PST 2012


On 28/02/12 17:17, Sean Silva wrote:
> It looks like what you want to do is to run a RecursiveASTVisitor over 
> the AST and essentially cherry-pick certain information off of it. It 
> may be a lot of work, but I think you could do it.
>
> Oh, except for the comment. That would be *much* more difficult. In 
> your example you seem to be including the comment as though it were a 
> statement in the body. How would your representation (which reminds me 
> of Prolog btw) represent:
>
>   int main(int argc, char * argv[])
>   {
>     return /* all ok */ 0;
>   }
Return(Comment("all ok"), Int(0i32))
You could also add File("myfile.sbt"), Line(22) and Column(32) anywhere 
to track the source location.
It all comes down to what you want to process and how.
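
A minimal C++ sketch of how such a tree might be built and printed follows. The names Node, call, symbol, literal, str, located and to_text are made up for illustration and are not the API of the library described in this thread. The sketch only assumes what is stated above: everything is a call, a symbol or a literal, and File/Line/Column can be attached as extra arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: a call (name plus arguments), a symbol, or a literal.
struct Node {
    enum class Kind { Call, Symbol, Literal };
    Kind kind;
    std::string text;            // call name, symbol name, or literal spelling
    std::vector<Node> children;  // arguments; empty for symbols and literals
};

Node call(std::string name, std::vector<Node> args = {}) {
    return Node{Node::Kind::Call, std::move(name), std::move(args)};
}

Node symbol(std::string name) {
    return Node{Node::Kind::Symbol, std::move(name), {}};
}

Node literal(std::string spelling) {
    return Node{Node::Kind::Literal, std::move(spelling), {}};
}

// String literal; escaping of embedded quotes is not handled in this sketch.
Node str(const std::string& s) {
    return literal("\"" + s + "\"");
}

// Append File(...), Line(...) and Column(...) as trailing arguments of a call.
Node located(Node n, const std::string& file, int line, int column) {
    assert(n.kind == Node::Kind::Call && "only calls can take extra arguments");
    n.children.push_back(call("File", {str(file)}));
    n.children.push_back(call("Line", {literal(std::to_string(line))}));
    n.children.push_back(call("Column", {literal(std::to_string(column))}));
    return n;
}

// Render a node in the textual notation used in this thread.
std::string to_text(const Node& n) {
    if (n.kind != Node::Kind::Call) return n.text;
    std::string out = n.text + "(";
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        if (i != 0) out += ", ";
        out += to_text(n.children[i]);
    }
    return out + ")";
}
```

With these helpers, to_text(call("Return", {call("Comment", {str("all ok")}), call("Int", {literal("0i32")})})) yields Return(Comment("all ok"), Int(0i32)), and passing the same node through located(..., "myfile.sbt", 22, 32) appends File("myfile.sbt"), Line(22), Column(32) to the argument list.
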
>
> or
>
>   int main(int argc, char * argv[])
>   {
>     doSomethingWithALotOfArgs(argv[0], argv, argv+argc, /*verbose=*/false);
>     return 0;
>   }
>
> Code like that last example is extremely common.
>
> For a more pathological example, consider
>
>   #define X(a,b) a##b
>   int X(ma,/*pure evil*/in)(int argc, char * argv[])
>   {
>     doSomethingWithALotOfArgs(argv[0], argv, argv+argc, /*verbose=*/false);
>     return 0;
>   }
, Macro
   ( Name(X)
   , Parameters(a, b)
   , Body
     ( Return(Concat(a, b))
     )
   )
, Function
   ( Name(X(ma, Comment("pure evil"), in))
   , Body
     ( Call(doSomethingWithALotOfArgs, Index(argv, 0), argv, Add(argv, argc),
            Comment("verbose="), Bool(false))
     , Return(Int(0i32))
     )
   )

My parser doesn't distinguish between "built-in" symbols and those used 
in the code.
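
To make that concrete, here is a hedged sketch of a recursive-descent reader for the textual form. It is not the actual parser from the project; Term and Parser are invented names, and the token rules (identifiers made of letters, digits and underscores, strings without escapes) are assumptions. The point it illustrates is that Return, Call and doSomethingWithALotOfArgs all go through the same code path: a name followed by "(" is a call, anything else is a symbol or literal, and the reader of the tree assigns the meaning.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parse result: a head token plus optional call arguments.
struct Term {
    std::string head;        // call name, symbol, or literal spelling
    bool is_call = false;    // true when the head was followed by "(...)"
    std::vector<Term> args;
};

class Parser {
public:
    explicit Parser(std::string text) : src_(std::move(text)) {}

    // Parse exactly one term and reject trailing input.
    Term parse() {
        Term t = term();
        skip_space();
        if (pos_ != src_.size()) throw std::runtime_error("trailing input");
        return t;
    }

private:
    void skip_space() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Consume `c` if it is the next non-space character.
    bool eat(char c) {
        skip_space();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // Read a quoted string (quotes kept) or a run of [A-Za-z0-9_].
    std::string token() {
        skip_space();
        const std::size_t start = pos_;
        if (pos_ < src_.size() && src_[pos_] == '"') {
            ++pos_;
            while (pos_ < src_.size() && src_[pos_] != '"') ++pos_;
            if (pos_ == src_.size()) throw std::runtime_error("unterminated string");
            ++pos_;  // closing quote
        } else {
            while (pos_ < src_.size() &&
                   (std::isalnum(static_cast<unsigned char>(src_[pos_])) ||
                    src_[pos_] == '_')) ++pos_;
        }
        assert(pos_ <= src_.size());
        if (start == pos_) throw std::runtime_error("expected a token");
        return src_.substr(start, pos_ - start);
    }

    // term := token [ "(" [ term { "," term } ] ")" ]
    Term term() {
        Term t;
        t.head = token();
        // No table of built-in names: any non-string head may start a call.
        if (t.head.front() != '"' && eat('(')) {
            t.is_call = true;
            if (!eat(')')) {
                do { t.args.push_back(term()); } while (eat(','));
                if (!eat(')')) throw std::runtime_error("expected ')'");
            }
        }
        return t;
    }

    std::string src_;
    std::size_t pos_ = 0;
};
```

Parser("Call(doSomethingWithALotOfArgs, Index(argv, 0), Bool(false))").parse() returns a call named Call with three arguments, the first of which is the plain symbol doSomethingWithALotOfArgs. Input with an unbalanced parenthesis is rejected with a std::runtime_error.
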
> The conclusion is that to actually be useful, your representation 
> would want to make certain things "off limits" or purposefully not 
> representable. It's up to you to draw the line. Once you have that 
> line, you can then get what you want in a pretty straightforward way 
> from clang.
I think the binary representation would be really useful as a 
pre-compiled header format where even macro expansion is deferred.
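
As an illustration of what deferred macro expansion could look like to a consumer of the tree, the following sketch expands a use such as X(ma, Comment("pure evil"), in) only when asked. Everything here is an assumption made for the example: the Node and Macro types, the rule that Comment arguments are skipped when binding parameters, and a macro body simplified to Concat(a, b). Nodes without arguments are treated as symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical tree node; a node without arguments is treated as a symbol.
struct Node {
    std::string head;
    std::vector<Node> args;
};

// Hypothetical stored macro: parameter names plus an unexpanded body.
struct Macro {
    std::vector<std::string> parameters;
    Node body;
};

// Replace parameter symbols in `n` with the argument nodes bound to them.
Node substitute(const Node& n, const std::map<std::string, Node>& bindings) {
    if (n.args.empty()) {
        auto it = bindings.find(n.head);
        return it != bindings.end() ? it->second : n;
    }
    Node out{n.head, {}};
    for (const Node& a : n.args) out.args.push_back(substitute(a, bindings));
    return out;
}

// Expand one macro use on demand. Preconditions are checked with assert.
Node expand(const Macro& m, const Node& use) {
    std::map<std::string, Node> bindings;
    std::size_t next = 0;
    for (const Node& arg : use.args) {
        if (arg.head == "Comment") continue;  // comments do not bind to parameters
        assert(next < m.parameters.size() && "too many macro arguments");
        bindings.emplace(m.parameters[next], arg);
        ++next;
    }
    assert(next == m.parameters.size() && "too few macro arguments");

    Node result = substitute(m.body, bindings);
    // Fold Concat(x, y) of two symbols into one symbol, mirroring a##b.
    const bool pastes_two_symbols =
        result.head == "Concat" && result.args.size() == 2 &&
        result.args[0].args.empty() && result.args[1].args.empty();
    if (pastes_two_symbols)
        return Node{result.args[0].head + result.args[1].head, {}};
    return result;
}
```

Given a Macro with parameters a and b and body Concat(a, b), expanding X(ma, Comment("pure evil"), in) produces the single symbol main, while the stored tree still contains the unexpanded use and its comment.
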

I forgot to mention that the format is in-place editable. With a 
snapshotting filesystem (e.g. via FUSE) you could efficiently modify it 
in place for one source file, make another snapshot and edit that, and 
then throw the snapshots away.

It's going to be part of my v3c-storyboard SourceForge project, and 
being able to process C/C++ into this format would
be a big plus.

Things like extracting function prototypes, automatically determining 
the required include files, and source translation all become a lot 
easier this way, as the library has a ridiculously simple C/C++ API: 
it's all about calls, symbols and literals.
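
As a rough sketch of the first of those tasks, the code below walks a tree depth-first and collects one record per Function call. It does not use the real library API; Node, Prototype, leaf, call, child, first_arg and collect are names invented here, and the heads it looks for (Function, Name, Returns, Parameters, Parameter) are taken from the examples in this thread. Types are kept as opaque strings rather than rendered back into C declarator syntax.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: a head plus arguments (empty for symbols/literals).
struct Node {
    std::string head;
    std::vector<Node> args;
};

Node leaf(std::string text) {
    assert(!text.empty() && "a node needs a head");
    return Node{std::move(text), {}};
}

Node call(std::string head, std::vector<Node> args) {
    return Node{std::move(head), std::move(args)};
}

// What this sketch extracts for each function.
struct Prototype {
    std::string name;
    std::string returns;
    std::vector<std::string> parameter_names;
};

// First direct argument of `n` with the given head, or nullptr if absent.
const Node* child(const Node& n, const std::string& head) {
    for (const Node& a : n.args)
        if (a.head == head) return &a;
    return nullptr;
}

// Head of a node's first argument, with surrounding quotes stripped.
std::string first_arg(const Node* n) {
    if (n == nullptr || n->args.empty()) return "";
    std::string s = n->args.front().head;
    if (s.size() >= 2 && s.front() == '"' && s.back() == '"')
        s = s.substr(1, s.size() - 2);
    return s;
}

// Depth-first walk that records every Function(...) it meets.
void collect(const Node& n, std::vector<Prototype>& out) {
    if (n.head == "Function") {
        Prototype p;
        p.name = first_arg(child(n, "Name"));
        p.returns = first_arg(child(n, "Returns"));
        if (const Node* params = child(n, "Parameters"))
            for (const Node& param : params->args)
                p.parameter_names.push_back(first_arg(child(param, "Name")));
        out.push_back(std::move(p));
    }
    for (const Node& a : n.args) collect(a, out);
}
```

Running collect over a Module containing the main example fills the vector with one Prototype whose name is main, whose return type is int, and whose parameter names are argc and argv. The function body is never inspected beyond the recursive walk.
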

> --Sean Silva
>
> On Tue, Feb 28, 2012 at 2:21 AM, Philip Ashmore 
> <contact at philipashmore.com> wrote:
>
>     On 28/02/12 07:01, Philip Ashmore wrote:
>     > Hi there.
>     >
>     > Here's the problem:
>     > Given the source file with this content:
>     >
>     >     int main(int argc, char * argv[])
>     >     {
>     >       /* all ok */
>     >       return 0;
>     >     }
>     >
>     > I want to convert it into something like this:
>     >
>     >     module
>     >     ( function
>     >       ( Name("main")
>     >       , Returns("int")
>     >       , Parameters
>     >         ( Parameter(Type("int"), Name("argc")
>     >         , Parameter(Type(Array(Pointer("char"), Size()), Name("argv")
>     >         )
>     >       , Body
>     >         ( Comment("all ok")
>     >         , Return(Int(0i32))
>     >         )
>     >       )
>     >     )
>     >
>     > This is a description format that has a binary representation
>     > that allows for easy depth-first and breadth-first traversal.
>     >
>     > With it one can describe C/C++, make files, pre-processor macros
>     > etc. - the reader supplies the meaning to the "calls" like "module".
>     >
>     > With it I hope to be able to describe things like interfaces and
>     > be able to automate the glue that allows it to be called from
>     > scripting languages, and much more.
>     >
>     > I haven't even given this format a name, but I can convert the
>     > text above to and from the binary representation.
>     >
>     > So that's the challenge - any takers?
>     >
>     > Regards,
>     > Philip Ashmore
>     >
>     > _______________________________________________
>     > cfe-dev mailing list
>     > cfe-dev at cs.uiuc.edu
>     > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>     OK, maybe not this exact example - the parameters are missing ')', but
>     you get the idea.
>
>     Philip
>
>


