[cfe-dev] For loop conversion (was: C++11 migration tools)

Manuel Klimek klimek at google.com
Mon Jul 2 07:22:02 PDT 2012


On Mon, Jul 2, 2012 at 4:16 PM, Sam Panzer <panzer at google.com> wrote:

> On Sun, Jul 1, 2012 at 10:45 PM, Manuel Klimek <klimek at google.com> wrote:
>
>> On Fri, Jun 29, 2012 at 8:17 PM, Sam Panzer <panzer at google.com> wrote:
>>
>>> Thanks for the input!
>>>
>>> Tooling/Refactoring is definitely the right way to go - dumping to
>>> stdout was just a holdover from the example on LibTooling. I'll change it
>>> once I figure out how it works - and a clean way to arrange the tests.
>>>
>>> As for the use of matchers vs. visitors, I decided to use a visitor
>>> because this is a relatively complex transformation. I would happily use
>>> matchers if I thought I could - and I think that some other c++11
>>> migrations can easily be written with matchers - but I think the for loop
>>>
>>
>> I'm not claiming that the matchers needed to match those constructs are
>> all already written - but if we write the questions you need to ask into
>> matchers, other people who want to match similar things can reuse them,
>> thus amplifying the impact of the code you write ;)
>>
>>
>>> checks need some features that matchers don't have (correct me if I'm
>>> wrong!). For example, the check for statically allocated array-based loops
>>> does this:
>>> Given a for loop, determine if:
>>>  - The increment statement increments (via ++) exactly one integral
>>> index variable, such that the variable is declared and initialized to zero
>>> in the init portion of the loop, and that this variable's value is a
>>> compared (via <, > or !=) to a nonnegative compile-time constant N in the
>>> compare portion.
>>>  - The index variable is only ever used in an ArrayIndexExpession
>>> indexing a single, statically allocated array A.
>>>  - The array A has exactly N elements.
>>>  - Additionally, if the ArrayIndexExpression A[index] is ever assigned,
>>> passed to a function or copied as a non-const reference, or its address
>>> taken with & (I still need to add a check for calls to non-const member
>>> functions), the loop variable in the converted version needs to be a
>>> non-const reference so that the value will be correctly updated (this step
>>> adds the most complexity).
>>>
>>
>> ... and the matcher I would want to write for this looks something like
>> that:
>> ForLoop(
>>   HasInitialization(Declaration(Id("loopvar", HasType(IsIntegral())))),
>>   HasCondition(BinaryOperator(
>>     HasAnyOperand(DeclarationReference(Id("condref", To(Variable())))),
>>     HasAnyOperand(IntegerLiteral()))),
>>
>> HasIncrement(UnaryOperator(HasUnaryOperand(DeclarationReference(Id("incref",
>> To(Variable()))))), ...),
>> )
>>
>> In general, the complex stuff can stay complex, but the simple stuff
>> shouldn't be lots of code.
>>
>
> Good point - and this is much easier to read than the equivalent code I
> had written.
>
>
>>
>>
>>> The other types of loop (iterator-based, and array-like container) are
>>> more complicated to detect, since there are more permitted ways to define
>>> and use the index/iterators. What makes this difficult to do entirely with
>>> matchers is the number of back- and cross-references, as well as the
>>> different local behaviors based on semantic properties. On the other hand,
>>> if there were some kind of backreference-enabled matcher that
>>>
>>
>> The way to handle the back-refs is to bind the nodes you want, and then
>> pull them out and compare them in the callback.
>>
>
> I see - make the matcher slightly more general, then filter the results,
> perhaps with a visitor.
>
>
>>
>> Thoughts?
>> /Manuel
>>
>
> This sounds like it would make at least half the work much easier, so I
> think it would definitely be worth it to try switching to a matcher-based
> solution. When are matchers supposed to hit mainline (or some extra
> cloneable repo) :) ?
>

Matchers are currently in
^cfe/branches/tooling/include/clang/ASTMatchers/...

I'm currently working on renaming them to camelCase from CamelCase; there's
a Tool to help with the conversion though, so no problem in starting now ...

Cheers,
/Manuel


>
> -Sam
>
>
>>
>>
>>> allowed me to locate all matches in a given Stmt, it could be *much*
>>> easier to express some parts of the logic, such as the first step in the
>>> above list. I also suspect that a single-Stmt matcher would better way to
>>> handle the last step; currently I track  whether the visitor is looking at
>>> a statement or expression which fits any of the const-disqualifying
>>> conditions, and a note is made if I run into A[index].
>>>
>>> Does this make the use case clearer? I don't really see a better way to
>>> approach this problem, but you guys know the available toolkit far better
>>> than I do.
>>>
>>> On Fri, Jun 29, 2012 at 2:48 AM, Manuel Klimek <klimek at google.com>wrote:
>>>
>>>> On Fri, Jun 29, 2012 at 11:45 AM, Chandler Carruth <
>>>> chandlerc at google.com> wrote:
>>>>
>>>>> I tend to agree that this should use the Tooling/Refactoring stuff.
>>>>>
>>>>> However, I'm curious about the best way to structure the location of
>>>>> migration candidates: AST matchers vs. visitor.
>>>>>
>>>>> I can almost see the visitor pattern working really well here as each
>>>>> different construct can have a pile of migration logic dropped in.... But
>>>>> if there is a need to connect dots between more distant constructs, that
>>>>> wouldn't work so well.... Not at all sure what would be best here.
>>>>>
>>>>
>>>> I've used a combination before - use matchers for the stuff where they
>>>> work well, then write a very small easy-to-understand visitor if you need
>>>> more. I think that brings down code size by quite a bit - obviously just a
>>>> gut feeling here.
>>>>
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 29, 2012 at 1:37 AM, Manuel Klimek <klimek at google.com>wrote:
>>>>>
>>>>>> On Fri, Jun 29, 2012 at 4:06 AM, Sam Panzer <panzer at google.com>wrote:
>>>>>>
>>>>>>> In case anyone wanted to take a look, the attached patch includes
>>>>>>> the tool I've been working on. I create a new binary, c++migrate, which
>>>>>>> attempts to convert for loops in the source files given to it. Most of my
>>>>>>> focus has been on the FrontedAction, so I skirted all of the issues
>>>>>>> mentioned above by keeping the frontend interaction minimal (i.e. I just
>>>>>>> call Tooling::ClangTool::run), and the changes are just reported on
>>>>>>> standard output, if there are any to be made.
>>>>>>>
>>>>>>> The tool can currently convert for loops that range over (1)
>>>>>>> statically allocated arrays, and (2) Clang-style iterator-based loops (with
>>>>>>> begin and end iterators defined). All loop variables need to be declared
>>>>>>> within the loop's initialization step in order for it to be converted,
>>>>>>> though this requirement can potentially be eliminated. I'm working on
>>>>>>> converting iterator-based loops that call someContainer.end() on each
>>>>>>> iteration, since they're probably the common case in many codebases.
>>>>>>>
>>>>>>> Just for fun, I ran the tool over the 41 .cpp files in lib/Sema, and
>>>>>>> my tool found 71 convertible loops in 17 files. There is plenty more work
>>>>>>> to go, because it clearly missed some easy ones.
>>>>>>>
>>>>>>> Any input or feedback is welcome!
>>>>>>>
>>>>>>
>>>>>> High-level observations:
>>>>>> 1. the handling of the rewrites; any reason not to use the
>>>>>> Tooling/Refactoring stuff? Currently in the patch it looks to me like the
>>>>>> files are not rewritten, but dumped to stdout
>>>>>> 2. is the reason not to use the matchers here that they're not landed
>>>>>> in mainline yet?
>>>>>>
>>>>>> Cheers,
>>>>>> /Manuel
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -Sam
>>>>>>>
>>>>>>> On Thu, Jun 28, 2012 at 10:50 AM, Sam Panzer <panzer at google.com>wrote:
>>>>>>>
>>>>>>>> I'm that intern :)
>>>>>>>>
>>>>>>>> -Sam
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 27, 2012 at 9:48 PM, John Wiegley <johnw at boostpro.com>wrote:
>>>>>>>>
>>>>>>>>> >>>>> Sam Panzer <panzer at google.com> writes:
>>>>>>>>>
>>>>>>>>> > In particular, I am working on a tool to convert existing C++
>>>>>>>>> for loops to
>>>>>>>>> > take advantage of the new C++11 range-based syntax. I can
>>>>>>>>> imagine similar
>>>>>>>>> > tools to replace const with constexpr, macro hacks with
>>>>>>>>> static_assert, and
>>>>>>>>> > potentially other common refactorings.
>>>>>>>>>
>>>>>>>>> > Thoughts? Suggestions?
>>>>>>>>>
>>>>>>>>> You really must watch this presentation, if you haven't already:
>>>>>>>>>
>>>>>>>>>     http://www.youtube.com/watch?v=yuIOGfcOH0k
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> John Wiegley
>>>>>>>>> BoostPro Computing
>>>>>>>>> http://www.boostpro.com
>>>>>>>>> _______________________________________________
>>>>>>>>> cfe-dev mailing list
>>>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-dev mailing list
>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cfe-dev mailing list
>>>>>> cfe-dev at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120702/0f40fa0a/attachment.html>


More information about the cfe-dev mailing list