[cfe-dev] [clang-tidy] Some possible contributions

Piotr Dziwinski via cfe-dev cfe-dev at lists.llvm.org
Mon Dec 14 14:32:10 PST 2015


On 14/12/15 20:42, Richard via cfe-dev wrote:
> [Please reply *only* to the list and do not include my email directly
> in the To: or Cc: of your reply; otherwise I will not see your reply.
> Thanks.]
>
> In article <CAOsfVvk=ZYrJF+N9R=SqyT-zzVZMguxKQur88X9=-Bs3kWUJ-Q at mail.gmail.com>,
>      Manuel Klimek via cfe-dev <cfe-dev at lists.llvm.org> writes:
>
>> On Sun, Dec 13, 2015 at 6:15 PM Piotr Dziwinski via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>> long int a, *b, c[10];
>>>
>>> I want to obtain the necessary source locations from AST so that I can
>>> parse this declaration into four component strings:
>>>    - type identifier: "long int"
>>>    - first variable declaration: "a"
>>>    - second variable declaration: "*b"
>>>    - third variable declaration: "c[10]"
>> I believe you'll need to whip out the lexer for this currently.
> IMO, the proper place to fix this is in the AST itself.  I would like
> the AST node to provide an iterator API that allows you to walk over
> all the identifiers (SourceLocation begin/end for each identifier) in
> a single declaration and obtain their actual types:
>
> a -> long int
> b -> long int *
> c -> long int[10]
>
> This is really important information for working with declarations and
> should be available from the tooling infrastructure.
>
> At one point I tried to look into how declarations were parsed, but
> not being that familiar with the parser code, I couldn't quickly find
> where such information could be stashed off for later querying.  It
> seems to me that during parsing we must already have this information,
> but it isn't directly encoded in the resulting AST.
Well, in a way we do have this information: it is part of the QualType
and Type components of the AST, which I can access for these declarations.
This means that I can generate the necessary insertions, like "long int
*b;", using QualType::getAsString() (*), and this is what the current
version of the checker actually does.

But the crux of the problem is generating correct removals in FixIt
hints.
An example would be when I need to move only the declaration of "b" out
of this multiple-declaration statement, leaving the declarations of "a"
and "c" where they are.
This is where I need to know the exact locations where the cuts are to
be made.
Knowing the actual type from QualType may help with that, but I still
need some help from either the AST or the Lexer to extract these locations.
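
To make the removal problem concrete, here is a minimal sketch in plain
C++ (it does not use any Clang API). It assumes the byte ranges of the
individual declarators are already known, which is exactly the
information I am missing; Range, removalRange and applyRemoval are
hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Half-open byte range [begin, end) into the statement text.
struct Range {
  std::size_t begin;
  std::size_t end;
};

// Hypothetical helper: given the ranges of all declarators in one
// statement, return the range to erase so that declarator `index`
// disappears together with exactly one separating comma.
// Assumes at least two declarators; removing the only declarator
// would mean removing the whole statement instead.
Range removalRange(const std::vector<Range> &declarators, std::size_t index) {
  assert(declarators.size() > 1 && index < declarators.size());
  if (index + 1 < declarators.size()) {
    // Not the last declarator: also erase the comma and whitespace
    // that follow it, up to the start of the next declarator.
    return {declarators[index].begin, declarators[index + 1].begin};
  }
  // Last declarator: erase the comma and whitespace that precede it.
  return {declarators[index - 1].end, declarators[index].end};
}

// Apply a removal to a copy of the statement text.
std::string applyRemoval(std::string text, Range r) {
  text.erase(r.begin, r.end - r.begin);
  return text;
}
```

With the ranges {9, 10}, {12, 14} and {16, 21} for "a", "*b" and "c[10]"
in "long int a, *b, c[10];", removing index 1 leaves
"long int a, c[10];". The hard part remains obtaining those ranges from
the AST or the Lexer in the first place.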

(*) Note that this doesn't always work as we want. From what I
understand, this string is meant to be displayed as a prettified version
of the type name in compiler diagnostics, not to regenerate the actual
code as it was typed in the program.
It also doesn't help with "exotic" types such as arrays or function
pointers, since I would have to insert the variable name somewhere in
the middle of it.

Ideally, I would like a mechanism to somehow parse and use the original 
code as written in the source, cutting out parts of it and moving them 
where necessary, keeping all information intact.
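
As a rough illustration of what I mean by reusing the original text,
here is a toy splitter in plain C++. It is not Clang's Lexer and it
assumes the offset where the declarator list begins is already known.
It also ignores comments, string literals, template angle brackets and
macros, so treat it only as a sketch; splitDeclarators and standalone
are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strip leading and trailing whitespace.
static std::string trim(const std::string &s) {
  const char *ws = " \t\r\n";
  std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Hypothetical helper: split the declarator list that starts at `offset`
// on commas that are not nested inside (), [] or {}, stopping at the
// first top-level ';'. Each piece keeps its original spelling.
std::vector<std::string> splitDeclarators(const std::string &stmt,
                                          std::size_t offset) {
  assert(offset <= stmt.size());
  std::vector<std::string> parts;
  int depth = 0;
  std::size_t start = offset;
  for (std::size_t i = offset; i < stmt.size(); ++i) {
    char ch = stmt[i];
    if (ch == '(' || ch == '[' || ch == '{') {
      ++depth;
    } else if (ch == ')' || ch == ']' || ch == '}') {
      --depth;
    } else if (depth == 0 && (ch == ',' || ch == ';')) {
      parts.push_back(trim(stmt.substr(start, i - start)));
      start = i + 1;
      if (ch == ';') break;
    }
  }
  return parts;
}

// Rebuild a standalone declaration from the original spellings.
std::string standalone(const std::string &typeSpec,
                       const std::string &declarator) {
  return typeSpec + " " + declarator + ";";
}
```

For "long int a, *b, c[10];" with offset 9 this yields "a", "*b" and
"c[10]", and standalone("long int", "c[10]") gives "long int c[10];"
without having to splice a name into the middle of a printed type.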

Best regards,
Piotr Dziwinski

