[cfe-dev] Debugging Information; exploiting comments
Ted Kremenek
kremenek at apple.com
Fri Nov 9 17:10:28 PST 2007
On Nov 9, 2007, at 4:57 PM, Ted Kremenek wrote:
>> I am investigating adding debugging information to clang. Do you
>> think it is too soon? (I would like to add a -g flag to the drivers
>> and add conditional llvm debug intrinsic emission in CodeGen).
>> Do you already have some ideas on the question? (I studied llvm-gcc
>> debug generation and would add something similar.)
I'm not currently involved with the efforts on the CodeGen module.
Chris, Devang?
>> Another idea I had is implementing a documentation tool (like
>> doxygen) using clang. The problem is that the existing framework
>> doesn't permit analyzing the comments and parsing in the same pass
>> (at least I don't see how).
You are indeed correct; currently the parser discards comments when the
ASTs are built. We have discussed the technical hurdles of doing
"appropriate" handling of comments, as they could be used for
doxygen-like tools, as well as for annotations that can be used by
other analysis tools.
The main challenge is that comments can appear literally anywhere, and
how they conceptually bind to entities in the program (be they
declarations or actual statements and expressions) is really specific
to the application that uses the comments (e.g. doxygen).
>> I would need to add some sort of callback in the lexer or
>> preprocessor for processing the comment (we can't parse the comment
>> token, it would be an impossible task). The callback would store the
>> comment, and when the next declaration is parsed, the stored comment
>> is used to decorate the declaration. Do you think this is a good way?
I'm not entirely certain how comments are processed by the lexer and
parser, and how easy it would be to add a callback. I believe that it
is doable, but I haven't really looked at that code. Steve, Chris?
Conceptually, if a callback mechanism for parsing comments is in place
you could then do whatever you wanted with the comments, although it
wouldn't necessarily be easy (it would depend on your application).
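To make the callback idea concrete, here is a minimal sketch of a lexer that hands each comment to a client instead of silently dropping it. Everything in it (`ToyLexer`, `setCommentHandler`, the token rules) is hypothetical and invented for illustration; it is not clang's Lexer or Preprocessor API, and it ignores string literals, so a `//` inside a string would be misread as a comment.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy lexer (not clang's): produces identifier/punctuation
// tokens and reports every comment through an optional callback.
class ToyLexer {
public:
  using CommentHandler =
      std::function<void(const std::string &text, std::size_t offset)>;

  explicit ToyLexer(std::string src) : src_(std::move(src)) {}

  void setCommentHandler(CommentHandler h) { onComment_ = std::move(h); }

  // Returns the non-comment tokens; comments only reach the handler.
  std::vector<std::string> lex() {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src_.size()) {
      unsigned char c = static_cast<unsigned char>(src_[i]);
      if (std::isspace(c)) {
        ++i;
      } else if (src_.compare(i, 2, "//") == 0) {
        // Line comment: runs to the end of the line (newline excluded).
        std::size_t end = src_.find('\n', i);
        if (end == std::string::npos) end = src_.size();
        report(i, end);
        i = end;
      } else if (src_.compare(i, 2, "/*") == 0) {
        // Block comment: runs through the closing "*/", or to EOF.
        std::size_t close = src_.find("*/", i + 2);
        std::size_t end =
            (close == std::string::npos) ? src_.size() : close + 2;
        report(i, end);
        i = end;
      } else if (std::isalnum(c) || c == '_') {
        std::size_t start = i;
        while (i < src_.size() &&
               (std::isalnum(static_cast<unsigned char>(src_[i])) ||
                src_[i] == '_'))
          ++i;
        tokens.push_back(src_.substr(start, i - start));
      } else {
        tokens.push_back(std::string(1, src_[i]));  // single punctuation char
        ++i;
      }
    }
    return tokens;
  }

private:
  void report(std::size_t begin, std::size_t end) {
    assert(begin <= end);
    if (onComment_) onComment_(src_.substr(begin, end - begin), begin);
  }

  std::string src_;
  CommentHandler onComment_;
};
```

A client would install a handler that stores the comment text and offset, then consult that store when the next declaration shows up. What "next" means is exactly the application-specific part.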
The ideal solution would be to separate the policy of how comments are
used (e.g. how you bind them to expressions, statements, declarations,
and so forth) from how they are parsed (or rather, how the ASTs are
built in Sema). That way a bunch of tools that process comments could
be built instead of a single ad hoc solution. We also don't want to
get into the business of people unnecessarily hacking on the Sema
module where the ASTs are built and semantically analyzed. Such hacks
would inevitably cause tools built on them to diverge from the
functionality available in "mainline" clang.
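As a rough illustration of that separation, the sketch below keeps the mechanism (walking declarations and recording which comment was attached) apart from the policy (deciding which comment, if any, belongs to a declaration). All names here (`Comment`, `Decl`, `CommentBindingPolicy`, `bindComments`) are hypothetical stand-ins, not clang classes, and positions are reduced to line numbers to keep it short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for what a frontend would record.
struct Comment { std::string text; unsigned line; };
struct Decl    { std::string name; unsigned line; };

// Policy interface: decides which comment (if any) documents a decl.
struct CommentBindingPolicy {
  virtual ~CommentBindingPolicy() = default;
  // Returns an index into `comments`, or -1 when nothing binds.
  virtual int bind(const Decl &d,
                   const std::vector<Comment> &comments) const = 0;
};

// Doxygen-like policy: a comment on the line directly above binds.
struct PrecedingCommentPolicy : CommentBindingPolicy {
  int bind(const Decl &d,
           const std::vector<Comment> &comments) const override {
    for (int i = 0; i < static_cast<int>(comments.size()); ++i)
      if (comments[i].line + 1 == d.line) return i;
    return -1;
  }
};

// Annotation-style policy: only a comment on the decl's own line binds.
struct SameLineCommentPolicy : CommentBindingPolicy {
  int bind(const Decl &d,
           const std::vector<Comment> &comments) const override {
    for (int i = 0; i < static_cast<int>(comments.size()); ++i)
      if (comments[i].line == d.line) return i;
    return -1;
  }
};

// Mechanism: knows nothing about any particular tool; it just applies
// whatever policy it is given and maps decl names to comment text.
std::map<std::string, std::string>
bindComments(const std::vector<Decl> &decls,
             const std::vector<Comment> &comments,
             const CommentBindingPolicy &policy) {
  std::map<std::string, std::string> bound;
  for (const Decl &d : decls) {
    int idx = policy.bind(d, comments);
    if (idx >= 0) bound[d.name] = comments[idx].text;
  }
  return bound;
}
```

With this shape, a documentation tool and an annotation checker can share the same frontend and differ only in the policy object they pass in, so neither needs to patch the code that builds the ASTs.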
>> Another way would be to add some sort of filter between the lexer
>> and the parser which would process and delete the comment tokens as
>> they come, but it would probably be slower and on the critical path
>> (not sure the lexing/parsing part is time critical since the
>> semantic analysis will eventually probably be a lot slower).
I'm not certain if I completely understand this solution. At the end
of the day you still need to bind comments (or whatever data you
extract from them) to ASTs (decls, etc.). Since the parser/lexer has
no notion of ASTs, you almost necessarily have to put some of the key
logic at a higher level (e.g., the Sema module). IIRC, essentially
the parser and lexer just build tokens and process the C grammar; Sema
actually builds the ASTs based on an interface between it and the parser.
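The division of labor described above can be sketched roughly as follows: the parser only recognizes grammar and calls through an abstract interface, while the implementation behind that interface is the one that creates AST nodes. The names (`ParserActions`, `actOnVarDecl`, `ToySema`, `VarDeclNode`) are made up for this example and are not the actual clang interface; the "grammar" is just `int <name> ;`, and the name token is not validated.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface standing in for the parser/Sema boundary.
struct ParserActions {
  virtual ~ParserActions() = default;
  virtual void actOnVarDecl(const std::string &name) = 0;
};

// Toy parser: recognizes only `int <name> ;` and has no notion of ASTs.
// Returns false at the first token sequence it cannot parse.
bool parseDecls(const std::vector<std::string> &toks,
                ParserActions &actions) {
  std::size_t i = 0;
  while (i < toks.size()) {
    if (i + 2 >= toks.size() || toks[i] != "int" || toks[i + 2] != ";")
      return false;
    actions.actOnVarDecl(toks[i + 1]);  // hand off; parser builds nothing
    i += 3;
  }
  return true;
}

// Minimal AST node and a Sema-like implementation that owns the nodes.
struct VarDeclNode { std::string name; };

struct ToySema : ParserActions {
  std::vector<std::unique_ptr<VarDeclNode>> decls;

  void actOnVarDecl(const std::string &name) override {
    decls.push_back(std::make_unique<VarDeclNode>(VarDeclNode{name}));
  }
};
```

Since the nodes only exist on the `ToySema` side, any logic that attaches comment data to a declaration has to live there (or in something it calls), which is the point being made about putting the key logic at a higher level than the lexer and parser.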