[cfe-dev] [clang-tidy] Some possible contributions

Piotr Dziwinski via cfe-dev cfe-dev at lists.llvm.org
Sun Dec 13 10:15:58 PST 2015


I want to let you know that I'm still here and still working on 
clang-tidy checker for localizing variables.
Sorry for the long delay, the time flies so fast.

Anyway, I have some good and some bad news.

The good news is that I got my checker to work correctly when it comes 
to handling nested scopes and control statements, including all the 
corner cases that I predicted.
I even made provisions for things like introducing localized for loop 
counters and inserting braces where necessary in switch-case blocks.

The bad news is that I ran into two issues that I couldn't resolve on my 
own, both of which are critical for the checker to be useful in real 
production code.

Issue #1. Handling multiple variables declared in one declaration statement.

    The case of moving a single declaration is easy enough - I can just
    cut the whole declaration as written in program source and paste it
    somewhere else.
    But if I have to cut out only selected parts of a multiple
    declaration and move them, I need some way of parsing the
    declaration to separate parts of the statement.
    Let's take for example:

        long int a, *b, c[10];

    I want to obtain the necessary source locations from AST so that I
    can parse this declaration into four component strings:
       - type identifier: "long int"
       - first variable declaration: "a"
       - second variable declaration: "*b"
       - third variable declaration: "c[10]"

    Once I have that, it will be easy to independently cut out code
    between identified source locations and create insertions by putting
    together the strings from source code.

    I tried different methods of achieving that but given the interface
    of DeclStmt and Decl, I cannot find a way to do that reliably for
    all possible cases.
    I can obtain source locations for subsequent variable declarations,
    but these always point to beginning of variable name as it appears
    in code.
    What I need is actual source location separating "long int" from "a"
    and further locations of separating commas between declarations.
    Theoretically, I could use Lexer to find comma tokens, but there are
    also other contexts where comma may appear, such list of arguments
    of function pointer.

Issue #2. Merging multiple FixIt hint insertions in the same location.

    For example, given:

        int a; int b;
        // ...
        call(a, b); // first use of "a" and "b"

    My checker would generate two diagnostics:

       - "variable "a" declaration can be moved closer to its point of
    use" with the following FixIt hint:

        int b;
        // ...
        int a; call(a, b);

       - "variable "b" declaration can be moved closer to its point of
    use" with the following FixIt hint:

        int a;
        // ...
        int b; call(a, b);

    The end result of applying these FixIt hints is:

        // ...
        int b; call(a, b);

    So the removals are handled correctly, but in case of insertions,
    the second one overrides the first one. What I want is to have both
    insertions present, one after the other.

    So far, I have worked around this problem by clumping together such
    problematic FixIt hints to one of the diagnostics, so that they are
    applied all at once.
    However, this can be quite confusing for the user who would see
    seemingly unrelated FixIt hints under a diagnostic referring to only
    one variable.
    Needless to say, this also makes the code much more complicated than
    it needs to be as I have to store information first and emit
    diagnostics later.

    I don't know how to resolve this, as this is a more a problem with
    applying FixIt hints in general.
    Perhaps this is actually the intended behavior in some cases. Even
    so, I would like to have a way to address my use case, without
    impacting other use cases.

I would appreciate any pointers of how to tackle these two issues.
I'm afraid that without solving them, my checker will remain in 
"always-beta" state.

Meanwhile, if you are interested in current version of the code, you can 
check out my Github fork:

Best regards,
Piotr Dziwinski

On 22/10/15 17:21, Richard via cfe-dev wrote:
> [Please reply *only* to the list and do not include my email directly
> in the To: or Cc: of your reply; otherwise I will not see your reply.
> Thanks.]
> In article <5626C26F.4040009 at gmail.com>,
>      Piotr Dziwinski via cfe-dev <cfe-dev at lists.llvm.org> writes:
>> It still needs work on corner cases and some hard cases like switch
>> statements, which I've left out for now. And of course more tests. I
>> will try to cover as much as possible in unit tests, but it's just
>> impossible to test everything. The real verification will be running it
>> on some production code. I will look at the project you gave as example,
>> too.
> Areas where I have found refactoring tools to fail are with "exotic"
> types like pointer to function, pointer to member data and pointer to
> member function.  Sometimes things freak out with unions, too.
> I looked at your test file on github and it's already covering the
> idea of multiple variables declared in a single statement where one of
> them is localized later in the scope, so that looks good.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20151213/4f4e113f/attachment.html>

More information about the cfe-dev mailing list