[cfe-dev] For loop conversion (was: C++11 migration tools)
Sam Panzer
panzer at google.com
Wed Jul 11 11:08:32 PDT 2012
Good point. I currently have a matcher that finds for loops that appear to
have the right shape for conversion, and I do some extra checking along
with the conversion work in the callback. I can completely handle
array-style loops with a Preprocessor, which is used when generating the
names of new variables. For iterator-style loops, I use Sema to determine
the return type of operator*, which will be the type of the loop variable.
Finally (though this isn't implemented yet), containers that are used as if
they're arrays will require Sema to check if suitable begin() and end()
functions exist.
For now, I can force the use of auto rather than an explicit type listing
for iterator-based loops and get most of the work done without Sema. The
question then becomes the correct way to avoid repeatedly creating a
Preprocessor object, though if this is a cheap operation, I can create one
in the callback.
Does this make the use clearer?
Thanks!
On Wed, Jul 11, 2012 at 12:29 AM, Manuel Klimek <klimek at google.com> wrote:
> On Wed, Jul 11, 2012 at 6:03 AM, Sam Panzer <panzer at google.com> wrote:
> > I'm trying to port my code to take advantage of matchers, now that they
> are
> > in mainline. Some of the work I want to do involves semantic analysis of
> the
> > results (i.e. in the callback). What would be the best way to get a Sema
> or
> > CompilerInstance out of either RefactoringTool or MatchResult? I'm
> currently
> > playing with changing MatchASTConsumer to inherit from SemaConsumer, so
> that
> > MatchFinder can track a Sema object the same way it does an ASTContext.
>
> I'd be interested in what the use cases for the semantic analysis are
> first. Often it seems like it would be better to have the information
> available in the AST instead of rerunning (potentially expensive)
> semantic analysis.
>
> Cheers,
> /Manuel
>
> >
> > Thanks!
> >
> >
> > On Mon, Jul 2, 2012 at 7:22 AM, Manuel Klimek <klimek at google.com> wrote:
> >>
> >> On Mon, Jul 2, 2012 at 4:16 PM, Sam Panzer <panzer at google.com> wrote:
> >>>
> >>> On Sun, Jul 1, 2012 at 10:45 PM, Manuel Klimek <klimek at google.com>
> wrote:
> >>>>
> >>>> On Fri, Jun 29, 2012 at 8:17 PM, Sam Panzer <panzer at google.com>
> wrote:
> >>>>>
> >>>>> Thanks for the input!
> >>>>>
> >>>>> Tooling/Refactoring is definitely the right way to go - dumping to
> >>>>> stdout was just a holdover from the example on LibTooling. I'll
> change it
> >>>>> once I figure out how it works - and a clean way to arrange the
> tests.
> >>>>>
> >>>>> As for the use of matchers vs. visitors, I decided to use a visitor
> >>>>> because this is a relatively complex transformation. I would happily
> use
> >>>>> matchers if I thought I could - and I think that some other c++11
> migrations
> >>>>> can easily be written with matchers - but I think the for loop
> >>>>
> >>>>
> >>>> I'm not claiming that the matchers needed to match those constructs
> are
> >>>> all already written - but if we write the questions you need to ask
> into
> >>>> matchers, other people who want to match similar things can reuse
> them, thus
> >>>> amplifying the impact of the code you write ;)
> >>>>
> >>>>>
> >>>>> checks need some features that matchers don't have (correct me if I'm
> >>>>> wrong!). For example, the check for statically allocated array-based
> loops
> >>>>> does this:
> >>>>> Given a for loop, determine if:
> >>>>> - The increment statement increments (via ++) exactly one integral
> >>>>> index variable, such that the variable is declared and initialized
> to zero
> >>>>> in the init portion of the loop, and that this variable's value is a
> >>>>> compared (via <, > or !=) to a nonnegative compile-time constant N
> in the
> >>>>> compare portion.
> >>>>> - The index variable is only ever used in an ArrayIndexExpession
> >>>>> indexing a single, statically allocated array A.
> >>>>> - The array A has exactly N elements.
> >>>>> - Additionally, if the ArrayIndexExpression A[index] is ever
> assigned,
> >>>>> passed to a function or copied as a non-const reference, or its
> address
> >>>>> taken with & (I still need to add a check for calls to non-const
> member
> >>>>> functions), the loop variable in the converted version needs to be a
> >>>>> non-const reference so that the value will be correctly updated
> (this step
> >>>>> adds the most complexity).
> >>>>
> >>>>
> >>>> ... and the matcher I would want to write for this looks something
> like
> >>>> that:
> >>>> ForLoop(
> >>>> HasInitialization(Declaration(Id("loopvar",
> HasType(IsIntegral())))),
> >>>> HasCondition(BinaryOperator(
> >>>> HasAnyOperand(DeclarationReference(Id("condref",
> To(Variable())))),
> >>>> HasAnyOperand(IntegerLiteral()))),
> >>>>
> >>>>
> HasIncrement(UnaryOperator(HasUnaryOperand(DeclarationReference(Id("incref",
> >>>> To(Variable()))))), ...),
> >>>> )
> >>>>
> >>>> In general, the complex stuff can stay complex, but the simple stuff
> >>>> shouldn't be lots of code.
> >>>
> >>>
> >>> Good point - and this is much easier to read than the equivalent code I
> >>> had written.
> >>>
> >>>>
> >>>>
> >>>>>
> >>>>> The other types of loop (iterator-based, and array-like container)
> are
> >>>>> more complicated to detect, since there are more permitted ways to
> define
> >>>>> and use the index/iterators. What makes this difficult to do
> entirely with
> >>>>> matchers is the number of back- and cross-references, as well as the
> >>>>> different local behaviors based on semantic properties. On the other
> hand,
> >>>>> if there were some kind of backreference-enabled matcher that
> >>>>
> >>>>
> >>>> The way to handle the back-refs is to bind the nodes you want, and
> then
> >>>> pull them out and compare them in the callback.
> >>>
> >>>
> >>> I see - make the matcher slightly more general, then filter the
> results,
> >>> perhaps with a visitor.
> >>>
> >>>>
> >>>>
> >>>> Thoughts?
> >>>> /Manuel
> >>>
> >>>
> >>> This sounds like it would make at least half the work much easier, so I
> >>> think it would definitely be worth it to try switching to a
> matcher-based
> >>> solution. When are matchers supposed to hit mainline (or some extra
> >>> cloneable repo) :) ?
> >>
> >>
> >> Matchers are currently in
> >> ^cfe/branches/tooling/include/clang/ASTMatchers/...
> >>
> >> I'm currently working on renaming them to camelCase from CamelCase;
> >> there's a Tool to help with the conversion though, so no problem in
> starting
> >> now ...
> >>
> >> Cheers,
> >> /Manuel
> >>
> >>>
> >>>
> >>> -Sam
> >>>
> >>>>
> >>>>
> >>>>>
> >>>>> allowed me to locate all matches in a given Stmt, it could be *much*
> >>>>> easier to express some parts of the logic, such as the first step in
> the
> >>>>> above list. I also suspect that a single-Stmt matcher would better
> way to
> >>>>> handle the last step; currently I track whether the visitor is
> looking at a
> >>>>> statement or expression which fits any of the const-disqualifying
> >>>>> conditions, and a note is made if I run into A[index].
> >>>>>
> >>>>> Does this make the use case clearer? I don't really see a better way
> to
> >>>>> approach this problem, but you guys know the available toolkit far
> better
> >>>>> than I do.
> >>>>>
> >>>>> On Fri, Jun 29, 2012 at 2:48 AM, Manuel Klimek <klimek at google.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> On Fri, Jun 29, 2012 at 11:45 AM, Chandler Carruth
> >>>>>> <chandlerc at google.com> wrote:
> >>>>>>>
> >>>>>>> I tend to agree that this should use the Tooling/Refactoring stuff.
> >>>>>>>
> >>>>>>> However, I'm curious about the best way to structure the location
> of
> >>>>>>> migration candidates: AST matchers vs. visitor.
> >>>>>>>
> >>>>>>> I can almost see the visitor pattern working really well here as
> each
> >>>>>>> different construct can have a pile of migration logic dropped
> in.... But if
> >>>>>>> there is a need to connect dots between more distant constructs,
> that
> >>>>>>> wouldn't work so well.... Not at all sure what would be best here.
> >>>>>>
> >>>>>>
> >>>>>> I've used a combination before - use matchers for the stuff where
> they
> >>>>>> work well, then write a very small easy-to-understand visitor if
> you need
> >>>>>> more. I think that brings down code size by quite a bit - obviously
> just a
> >>>>>> gut feeling here.
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Jun 29, 2012 at 1:37 AM, Manuel Klimek <klimek at google.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> On Fri, Jun 29, 2012 at 4:06 AM, Sam Panzer <panzer at google.com>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> In case anyone wanted to take a look, the attached patch includes
> >>>>>>>>> the tool I've been working on. I create a new binary,
> c++migrate, which
> >>>>>>>>> attempts to convert for loops in the source files given to it.
> Most of my
> >>>>>>>>> focus has been on the FrontedAction, so I skirted all of the
> issues
> >>>>>>>>> mentioned above by keeping the frontend interaction minimal
> (i.e. I just
> >>>>>>>>> call Tooling::ClangTool::run), and the changes are just reported
> on standard
> >>>>>>>>> output, if there are any to be made.
> >>>>>>>>>
> >>>>>>>>> The tool can currently convert for loops that range over (1)
> >>>>>>>>> statically allocated arrays, and (2) Clang-style iterator-based
> loops (with
> >>>>>>>>> begin and end iterators defined). All loop variables need to be
> declared
> >>>>>>>>> within the loop's initialization step in order for it to be
> converted,
> >>>>>>>>> though this requirement can potentially be eliminated. I'm
> working on
> >>>>>>>>> converting iterator-based loops that call someContainer.end() on
> each
> >>>>>>>>> iteration, since they're probably the common case in many
> codebases.
> >>>>>>>>>
> >>>>>>>>> Just for fun, I ran the tool over the 41 .cpp files in lib/Sema,
> >>>>>>>>> and my tool found 71 convertible loops in 17 files. There is
> plenty more
> >>>>>>>>> work to go, because it clearly missed some easy ones.
> >>>>>>>>>
> >>>>>>>>> Any input or feedback is welcome!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> High-level observations:
> >>>>>>>> 1. the handling of the rewrites; any reason not to use the
> >>>>>>>> Tooling/Refactoring stuff? Currently in the patch it looks to me
> like the
> >>>>>>>> files are not rewritten, but dumped to stdout
> >>>>>>>> 2. is the reason not to use the matchers here that they're not
> >>>>>>>> landed in mainline yet?
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> /Manuel
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -Sam
> >>>>>>>>>
> >>>>>>>>> On Thu, Jun 28, 2012 at 10:50 AM, Sam Panzer <panzer at google.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> I'm that intern :)
> >>>>>>>>>>
> >>>>>>>>>> -Sam
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 27, 2012 at 9:48 PM, John Wiegley <
> johnw at boostpro.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> >>>>> Sam Panzer <panzer at google.com> writes:
> >>>>>>>>>>>
> >>>>>>>>>>> > In particular, I am working on a tool to convert existing C++
> >>>>>>>>>>> > for loops to
> >>>>>>>>>>> > take advantage of the new C++11 range-based syntax. I can
> >>>>>>>>>>> > imagine similar
> >>>>>>>>>>> > tools to replace const with constexpr, macro hacks with
> >>>>>>>>>>> > static_assert, and
> >>>>>>>>>>> > potentially other common refactorings.
> >>>>>>>>>>>
> >>>>>>>>>>> > Thoughts? Suggestions?
> >>>>>>>>>>>
> >>>>>>>>>>> You really must watch this presentation, if you haven't
> already:
> >>>>>>>>>>>
> >>>>>>>>>>> http://www.youtube.com/watch?v=yuIOGfcOH0k
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> John Wiegley
> >>>>>>>>>>> BoostPro Computing
> >>>>>>>>>>> http://www.boostpro.com
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> cfe-dev mailing list
> >>>>>>>>>>> cfe-dev at cs.uiuc.edu
> >>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> cfe-dev mailing list
> >>>>>>>>> cfe-dev at cs.uiuc.edu
> >>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> cfe-dev mailing list
> >>>>>>>> cfe-dev at cs.uiuc.edu
> >>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> >
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120711/156497c7/attachment.html>
More information about the cfe-dev
mailing list