[cfe-dev] For loop conversion (was: C++11 migration tools)

Manuel Klimek klimek at google.com
Sun Jul 1 22:45:26 PDT 2012


On Fri, Jun 29, 2012 at 8:17 PM, Sam Panzer <panzer at google.com> wrote:

> Thanks for the input!
>
> Tooling/Refactoring is definitely the right way to go - dumping to stdout
> was just a holdover from the example on LibTooling. I'll change it once I
> figure out how it works - and a clean way to arrange the tests.
>
> As for the use of matchers vs. visitors, I decided to use a visitor
> because this is a relatively complex transformation. I would happily use
> matchers if I thought I could - and I think that some other c++11
> migrations can easily be written with matchers - but I think the for loop
>

I'm not claiming that the matchers needed to match those constructs are all
already written - but if we write the questions you need to ask into
matchers, other people who want to match similar things can reuse them,
thus amplifying the impact of the code you write ;)


> checks need some features that matchers don't have (correct me if I'm
> wrong!). For example, the check for statically allocated array-based loops
> does this:
> Given a for loop, determine if:
>  - The increment statement increments (via ++) exactly one integral index
> variable, such that the variable is declared and initialized to zero in the
> init portion of the loop, and that this variable's value is a compared (via
> <, > or !=) to a nonnegative compile-time constant N in the compare portion.
>  - The index variable is only ever used in an ArrayIndexExpession indexing
> a single, statically allocated array A.
>  - The array A has exactly N elements.
>  - Additionally, if the ArrayIndexExpression A[index] is ever assigned,
> passed to a function or copied as a non-const reference, or its address
> taken with & (I still need to add a check for calls to non-const member
> functions), the loop variable in the converted version needs to be a
> non-const reference so that the value will be correctly updated (this step
> adds the most complexity).
>

... and the matcher I would want to write for this looks something like
that:
ForLoop(
  HasInitialization(Declaration(Id("loopvar", HasType(IsIntegral())))),
  HasCondition(BinaryOperator(
    HasAnyOperand(DeclarationReference(Id("condref", To(Variable())))),
    HasAnyOperand(IntegerLiteral()))),

HasIncrement(UnaryOperator(HasUnaryOperand(DeclarationReference(Id("incref",
To(Variable()))))), ...),
)

In general, the complex stuff can stay complex, but the simple stuff
shouldn't be lots of code.


> The other types of loop (iterator-based, and array-like container) are
> more complicated to detect, since there are more permitted ways to define
> and use the index/iterators. What makes this difficult to do entirely with
> matchers is the number of back- and cross-references, as well as the
> different local behaviors based on semantic properties. On the other hand,
> if there were some kind of backreference-enabled matcher that
>

The way to handle the back-refs is to bind the nodes you want, and then
pull them out and compare them in the callback.

Thoughts?
/Manuel


> allowed me to locate all matches in a given Stmt, it could be *much*
> easier to express some parts of the logic, such as the first step in the
> above list. I also suspect that a single-Stmt matcher would better way to
> handle the last step; currently I track  whether the visitor is looking at
> a statement or expression which fits any of the const-disqualifying
> conditions, and a note is made if I run into A[index].
>
> Does this make the use case clearer? I don't really see a better way to
> approach this problem, but you guys know the available toolkit far better
> than I do.
>
> On Fri, Jun 29, 2012 at 2:48 AM, Manuel Klimek <klimek at google.com> wrote:
>
>> On Fri, Jun 29, 2012 at 11:45 AM, Chandler Carruth <chandlerc at google.com>wrote:
>>
>>> I tend to agree that this should use the Tooling/Refactoring stuff.
>>>
>>> However, I'm curious about the best way to structure the location of
>>> migration candidates: AST matchers vs. visitor.
>>>
>>> I can almost see the visitor pattern working really well here as each
>>> different construct can have a pile of migration logic dropped in.... But
>>> if there is a need to connect dots between more distant constructs, that
>>> wouldn't work so well.... Not at all sure what would be best here.
>>>
>>
>> I've used a combination before - use matchers for the stuff where they
>> work well, then write a very small easy-to-understand visitor if you need
>> more. I think that brings down code size by quite a bit - obviously just a
>> gut feeling here.
>>
>>
>>>
>>>
>>> On Fri, Jun 29, 2012 at 1:37 AM, Manuel Klimek <klimek at google.com>wrote:
>>>
>>>> On Fri, Jun 29, 2012 at 4:06 AM, Sam Panzer <panzer at google.com> wrote:
>>>>
>>>>> In case anyone wanted to take a look, the attached patch includes the
>>>>> tool I've been working on. I create a new binary, c++migrate, which
>>>>> attempts to convert for loops in the source files given to it. Most of my
>>>>> focus has been on the FrontedAction, so I skirted all of the issues
>>>>> mentioned above by keeping the frontend interaction minimal (i.e. I just
>>>>> call Tooling::ClangTool::run), and the changes are just reported on
>>>>> standard output, if there are any to be made.
>>>>>
>>>>> The tool can currently convert for loops that range over (1)
>>>>> statically allocated arrays, and (2) Clang-style iterator-based loops (with
>>>>> begin and end iterators defined). All loop variables need to be declared
>>>>> within the loop's initialization step in order for it to be converted,
>>>>> though this requirement can potentially be eliminated. I'm working on
>>>>> converting iterator-based loops that call someContainer.end() on each
>>>>> iteration, since they're probably the common case in many codebases.
>>>>>
>>>>> Just for fun, I ran the tool over the 41 .cpp files in lib/Sema, and
>>>>> my tool found 71 convertible loops in 17 files. There is plenty more work
>>>>> to go, because it clearly missed some easy ones.
>>>>>
>>>>> Any input or feedback is welcome!
>>>>>
>>>>
>>>> High-level observations:
>>>> 1. the handling of the rewrites; any reason not to use the
>>>> Tooling/Refactoring stuff? Currently in the patch it looks to me like the
>>>> files are not rewritten, but dumped to stdout
>>>> 2. is the reason not to use the matchers here that they're not landed
>>>> in mainline yet?
>>>>
>>>> Cheers,
>>>> /Manuel
>>>>
>>>>
>>>>>
>>>>> -Sam
>>>>>
>>>>> On Thu, Jun 28, 2012 at 10:50 AM, Sam Panzer <panzer at google.com>wrote:
>>>>>
>>>>>> I'm that intern :)
>>>>>>
>>>>>> -Sam
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 27, 2012 at 9:48 PM, John Wiegley <johnw at boostpro.com>wrote:
>>>>>>
>>>>>>> >>>>> Sam Panzer <panzer at google.com> writes:
>>>>>>>
>>>>>>> > In particular, I am working on a tool to convert existing C++ for
>>>>>>> loops to
>>>>>>> > take advantage of the new C++11 range-based syntax. I can imagine
>>>>>>> similar
>>>>>>> > tools to replace const with constexpr, macro hacks with
>>>>>>> static_assert, and
>>>>>>> > potentially other common refactorings.
>>>>>>>
>>>>>>> > Thoughts? Suggestions?
>>>>>>>
>>>>>>> You really must watch this presentation, if you haven't already:
>>>>>>>
>>>>>>>     http://www.youtube.com/watch?v=yuIOGfcOH0k
>>>>>>>
>>>>>>> --
>>>>>>> John Wiegley
>>>>>>> BoostPro Computing
>>>>>>> http://www.boostpro.com
>>>>>>> _______________________________________________
>>>>>>> cfe-dev mailing list
>>>>>>> cfe-dev at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120702/6bce112f/attachment.html>


More information about the cfe-dev mailing list