[cfe-dev] libclang interface for comments AST
Dmitri Gribenko
gribozavr at gmail.com
Tue Jul 17 13:34:23 PDT 2012
On Tue, Jul 17, 2012 at 11:35 AM, Douglas Gregor <dgregor at apple.com> wrote:
>
> On Jul 17, 2012, at 10:19 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>
>> Hi Doug!
>>
>> Thank you for looking at this!
>>
>> On Tue, Jul 17, 2012 at 9:54 AM, Douglas Gregor <dgregor at apple.com> wrote:
>>> On Jul 16, 2012, at 12:48 PM, Dmitri Gribenko wrote:
>>
>>> +CINDEX_LINKAGE
>>> +CXString clang_InlineCommandComment_getCommandName(CXComment Comment);
>>> <snip>
>>> +CINDEX_LINKAGE
>>> +CXString clang_BlockCommandComment_getCommandName(CXComment Comment);
>>>
>>> Why have both of these, rather than simply
>>>
>>> CINDEX_LINKAGE
>>> CXString clang_Comment_getCommandName(CXComment Comment);
>>>
>>> ?
>>>
>>> Similar comment for clang_*Comment_getNumArgs/clang_*Comment_getArgText
>>
>> So that this function would internally determine the type of the AST
>> node and "just do the right thing"? This, combined with ignoring AST
>> node type mismatches and making up "safe" values, might create some
>> issues with understanding the API and lead to API misuse.
>
> Perhaps.
>
>>> Also, there are a number of clang_*Comment_getText() calls.
>>
>> For example, someone might apply this clang_Comment_getText() call to
>> a FullComment and expect it to return a comment represented as plain
>> text (somehow, with or without HTML and Doxygen markup, word-wrapped
>> to fit 80 cols or not -- expectations can be very different). But,
>> unless we implement this AST traversal, we will just return an empty
>> value because we will call getText() only for AST nodes that have
>> getText() call themselves.
>>
>> Or implementing getText() to work universally is the way to go?
>
> I generally prefer fewer entry points, so that clients that have to switch() on all of the enum values can at the very least combine cases for obvious things (say, if they don't care about adding special formatting for HTML or other inlines). The general philosophy I have is that simple clients should be able to do simple things, but the information should be there to do more complicated things.
>
> But, I completely see your point about this potentially causing confusion with the API, and we have seen some confusion with other libclang APIs that skew toward fewer, more general entry points. We can try it your way :)
Don't get me wrong, I see how a general getText() call is a useful
API. But I think it would be better if it would be a separate API
call (for example, getAsPlainText()) that is designed (and expected by
users) to do all the required magic. I see these various *_getText()
APIs as property getters for specific node kinds, low-level building
blocks without any magic to facilitate building other tools on top of
libclang (for example, implement some other way to convert a comment
to plain text).
>>> Perhaps there should just be a single clang_Comment_getText() that works for the various comment kinds? The main outlier is
>>>
>>> +/**
>>> + * \param Comment a \c CXComment_VerbatimBlockCommand AST node.
>>> + *
>>> + * \param LineIdx verbatim text line index (zero-based).
>>> + *
>>> + * \returns text contained in the specified line.
>>> + */
>>> +CINDEX_LINKAGE
>>> +CXString clang_VerbatimBlockComment_getText(CXComment Comment,
>>> + unsigned LineIdx);
>>>
>>> and I'm wondering why one doesn't simply get the CXComment_VerbatimBlockLine via clang_Comment_getNumChildren/clang_Comment_getChild?
>>
>> It is just a convenience function. It can be easily removed.
>
> Okay. I'd prefer removal, but would also be fine with commenting that these CXComment_VerbatimBlockLine comments will also be part of the children of the node, so users don't think they have to traverse both.
Removed to reduce confusion.
> Aside: I notice that most of your comments for these APIs don't have any 'brief' paragraph; they simply have \param and \returns clauses.
\brief would be a copy of \returns -- these functions are just getters
for various node properties. I can rephrase \returns as \brief and
remove \returns.
>>> This group of functions:
>>>
>>> +CINDEX_LINKAGE CXString clang_HTMLTagComment_getTagName(CXComment Comment);
>>> +CINDEX_LINKAGE
>>> +unsigned clang_HTMLStartTagComment_isSelfClosing(CXComment Comment);
>>> +CINDEX_LINKAGE unsigned clang_HTMLStartTag_getNumAttrs(CXComment Comment);
>>> +CINDEX_LINKAGE
>>> +CXString clang_HTMLStartTag_getAttrName(CXComment Comment, unsigned AttrIdx);
>>> +CINDEX_LINKAGE
>>> +CXString clang_HTMLStartTag_getAttrValue(CXComment Comment, unsigned AttrIdx);
>>>
>>> implies that a client will have to reconstruct HTML tags on its own, which seems like it's going to cause a lot of redundant (and probably wrong) code in clients. Can the aforementioned clang_Comment_getText() do the right thing for CXComment_HTMLStartTag and CXComment_HTMLEndTag nodes, by formatting the HTML tag appropriately? That would simplify a number of clients.
>>
>> It surely makes sense to have such API, but see my concerns about
>> general getText() API.
>
>
> I'd at least like to have this API available.
Of course, just with a separate function name.
Dmitri
--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/
More information about the cfe-dev
mailing list