[cfe-dev] For loop conversion (was: C++11 migration tools)

Sam Panzer panzer at google.com
Tue Jul 10 21:03:05 PDT 2012


I'm trying to port my code to take advantage of matchers, now that they are
in mainline. Some of the work I want to do involves semantic analysis of
the results (i.e. in the callback). What would be the best way to get a
Sema or CompilerInstance out of either RefactoringTool or MatchResult? I'm
currently playing with changing MatchASTConsumer to inherit from
SemaConsumer, so that MatchFinder can track a Sema object the same way it
does an ASTContext.

Thanks!


On Mon, Jul 2, 2012 at 7:22 AM, Manuel Klimek <klimek at google.com> wrote:

> On Mon, Jul 2, 2012 at 4:16 PM, Sam Panzer <panzer at google.com> wrote:
>
>> On Sun, Jul 1, 2012 at 10:45 PM, Manuel Klimek <klimek at google.com> wrote:
>>
>>> On Fri, Jun 29, 2012 at 8:17 PM, Sam Panzer <panzer at google.com> wrote:
>>>
>>>> Thanks for the input!
>>>>
>>>> Tooling/Refactoring is definitely the right way to go - dumping to
>>>> stdout was just a holdover from the example on LibTooling. I'll change it
>>>> once I figure out how it works - and a clean way to arrange the tests.
>>>>
>>>> As for the use of matchers vs. visitors, I decided to use a visitor
>>>> because this is a relatively complex transformation. I would happily use
>>>> matchers if I thought I could - and I think that some other c++11
>>>> migrations can easily be written with matchers - but I think the for loop
>>>>
>>>
>>> I'm not claiming that the matchers needed to match those constructs are
>>> all already written - but if we write the questions you need to ask into
>>> matchers, other people who want to match similar things can reuse them,
>>> thus amplifying the impact of the code you write ;)
>>>
>>>
>>>> checks need some features that matchers don't have (correct me if I'm
>>>> wrong!). For example, the check for statically allocated array-based loops
>>>> does this:
>>>> Given a for loop, determine if:
>>>>  - The increment statement increments (via ++) exactly one integral
>>>> index variable, such that the variable is declared and initialized to zero
>>>> in the init portion of the loop, and that this variable's value is a
>>>> compared (via <, > or !=) to a nonnegative compile-time constant N in the
>>>> compare portion.
>>>>  - The index variable is only ever used in an ArrayIndexExpession
>>>> indexing a single, statically allocated array A.
>>>>  - The array A has exactly N elements.
>>>>  - Additionally, if the ArrayIndexExpression A[index] is ever assigned,
>>>> passed to a function or copied as a non-const reference, or its address
>>>> taken with & (I still need to add a check for calls to non-const member
>>>> functions), the loop variable in the converted version needs to be a
>>>> non-const reference so that the value will be correctly updated (this step
>>>> adds the most complexity).
>>>>
>>>
>>> ... and the matcher I would want to write for this looks something like
>>> that:
>>> ForLoop(
>>>   HasInitialization(Declaration(Id("loopvar", HasType(IsIntegral())))),
>>>   HasCondition(BinaryOperator(
>>>     HasAnyOperand(DeclarationReference(Id("condref", To(Variable())))),
>>>     HasAnyOperand(IntegerLiteral()))),
>>>
>>> HasIncrement(UnaryOperator(HasUnaryOperand(DeclarationReference(Id("incref",
>>> To(Variable()))))), ...),
>>> )
>>>
>>> In general, the complex stuff can stay complex, but the simple stuff
>>> shouldn't be lots of code.
>>>
>>
>> Good point - and this is much easier to read than the equivalent code I
>> had written.
>>
>>
>>>
>>>
>>>> The other types of loop (iterator-based, and array-like container) are
>>>> more complicated to detect, since there are more permitted ways to define
>>>> and use the index/iterators. What makes this difficult to do entirely with
>>>> matchers is the number of back- and cross-references, as well as the
>>>> different local behaviors based on semantic properties. On the other hand,
>>>> if there were some kind of backreference-enabled matcher that
>>>>
>>>
>>> The way to handle the back-refs is to bind the nodes you want, and then
>>> pull them out and compare them in the callback.
>>>
>>
>> I see - make the matcher slightly more general, then filter the results,
>> perhaps with a visitor.
>>
>>
>>>
>>> Thoughts?
>>> /Manuel
>>>
>>
>> This sounds like it would make at least half the work much easier, so I
>> think it would definitely be worth it to try switching to a matcher-based
>> solution. When are matchers supposed to hit mainline (or some extra
>> cloneable repo) :) ?
>>
>
> Matchers are currently in
> ^cfe/branches/tooling/include/clang/ASTMatchers/...
>
> I'm currently working on renaming them to camelCase from CamelCase;
> there's a Tool to help with the conversion though, so no problem in
> starting now ...
>
> Cheers,
> /Manuel
>
>
>>
>> -Sam
>>
>>
>>>
>>>
>>>> allowed me to locate all matches in a given Stmt, it could be *much*
>>>> easier to express some parts of the logic, such as the first step in the
>>>> above list. I also suspect that a single-Stmt matcher would better way to
>>>> handle the last step; currently I track  whether the visitor is looking at
>>>> a statement or expression which fits any of the const-disqualifying
>>>> conditions, and a note is made if I run into A[index].
>>>>
>>>> Does this make the use case clearer? I don't really see a better way to
>>>> approach this problem, but you guys know the available toolkit far better
>>>> than I do.
>>>>
>>>> On Fri, Jun 29, 2012 at 2:48 AM, Manuel Klimek <klimek at google.com>wrote:
>>>>
>>>>> On Fri, Jun 29, 2012 at 11:45 AM, Chandler Carruth <
>>>>> chandlerc at google.com> wrote:
>>>>>
>>>>>> I tend to agree that this should use the Tooling/Refactoring stuff.
>>>>>>
>>>>>> However, I'm curious about the best way to structure the location of
>>>>>> migration candidates: AST matchers vs. visitor.
>>>>>>
>>>>>> I can almost see the visitor pattern working really well here as each
>>>>>> different construct can have a pile of migration logic dropped in.... But
>>>>>> if there is a need to connect dots between more distant constructs, that
>>>>>> wouldn't work so well.... Not at all sure what would be best here.
>>>>>>
>>>>>
>>>>> I've used a combination before - use matchers for the stuff where they
>>>>> work well, then write a very small easy-to-understand visitor if you need
>>>>> more. I think that brings down code size by quite a bit - obviously just a
>>>>> gut feeling here.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 29, 2012 at 1:37 AM, Manuel Klimek <klimek at google.com>wrote:
>>>>>>
>>>>>>> On Fri, Jun 29, 2012 at 4:06 AM, Sam Panzer <panzer at google.com>wrote:
>>>>>>>
>>>>>>>> In case anyone wanted to take a look, the attached patch includes
>>>>>>>> the tool I've been working on. I create a new binary, c++migrate, which
>>>>>>>> attempts to convert for loops in the source files given to it. Most of my
>>>>>>>> focus has been on the FrontedAction, so I skirted all of the issues
>>>>>>>> mentioned above by keeping the frontend interaction minimal (i.e. I just
>>>>>>>> call Tooling::ClangTool::run), and the changes are just reported on
>>>>>>>> standard output, if there are any to be made.
>>>>>>>>
>>>>>>>> The tool can currently convert for loops that range over (1)
>>>>>>>> statically allocated arrays, and (2) Clang-style iterator-based loops (with
>>>>>>>> begin and end iterators defined). All loop variables need to be declared
>>>>>>>> within the loop's initialization step in order for it to be converted,
>>>>>>>> though this requirement can potentially be eliminated. I'm working on
>>>>>>>> converting iterator-based loops that call someContainer.end() on each
>>>>>>>> iteration, since they're probably the common case in many codebases.
>>>>>>>>
>>>>>>>> Just for fun, I ran the tool over the 41 .cpp files in lib/Sema,
>>>>>>>> and my tool found 71 convertible loops in 17 files. There is plenty more
>>>>>>>> work to go, because it clearly missed some easy ones.
>>>>>>>>
>>>>>>>> Any input or feedback is welcome!
>>>>>>>>
>>>>>>>
>>>>>>> High-level observations:
>>>>>>> 1. the handling of the rewrites; any reason not to use the
>>>>>>> Tooling/Refactoring stuff? Currently in the patch it looks to me like the
>>>>>>> files are not rewritten, but dumped to stdout
>>>>>>> 2. is the reason not to use the matchers here that they're not
>>>>>>> landed in mainline yet?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> /Manuel
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> -Sam
>>>>>>>>
>>>>>>>> On Thu, Jun 28, 2012 at 10:50 AM, Sam Panzer <panzer at google.com>wrote:
>>>>>>>>
>>>>>>>>> I'm that intern :)
>>>>>>>>>
>>>>>>>>> -Sam
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 27, 2012 at 9:48 PM, John Wiegley <johnw at boostpro.com>wrote:
>>>>>>>>>
>>>>>>>>>> >>>>> Sam Panzer <panzer at google.com> writes:
>>>>>>>>>>
>>>>>>>>>> > In particular, I am working on a tool to convert existing C++
>>>>>>>>>> for loops to
>>>>>>>>>> > take advantage of the new C++11 range-based syntax. I can
>>>>>>>>>> imagine similar
>>>>>>>>>> > tools to replace const with constexpr, macro hacks with
>>>>>>>>>> static_assert, and
>>>>>>>>>> > potentially other common refactorings.
>>>>>>>>>>
>>>>>>>>>> > Thoughts? Suggestions?
>>>>>>>>>>
>>>>>>>>>> You really must watch this presentation, if you haven't already:
>>>>>>>>>>
>>>>>>>>>>     http://www.youtube.com/watch?v=yuIOGfcOH0k
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> John Wiegley
>>>>>>>>>> BoostPro Computing
>>>>>>>>>> http://www.boostpro.com
>>>>>>>>>> _______________________________________________
>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing list
>>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-dev mailing list
>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120710/0663e8fa/attachment.html>


More information about the cfe-dev mailing list