[cfe-commits] [PATCH] C++11 Range-based Loop Converter

Thu Aug 23 10:15:52 PDT 2012

  Next round up for review.

================
Comment at: loop-convert/LoopActions.cpp:228
@@ +227,3 @@
+
+/// \brief If the expression is a dereference or call to operator*(), return the
+/// operand. Otherwise, returns NULL.
----------------
Manuel Klimek wrote:
> s/return/returns/ for consistency.
Done.

================
Comment at: loop-convert/LoopActions.cpp:255
@@ +254,3 @@
+/// \brief Returns true when the index expression is a declaration reference to
+/// IndexVar and the array's base exists.
+///
----------------
Manuel Klimek wrote:
> The "and the array's base exists" part seems pretty arbitrary. 
This function also used to do extra checking on the array expression. Removed.

================
Comment at: loop-convert/LoopActions.cpp:299
@@ +298,3 @@
+  if (areSameExpr(Context, SourceExpr->IgnoreParenImpCasts(),
+                   Obj->IgnoreParenImpCasts()))
+    return true;
----------------
Manuel Klimek wrote:
> Indent.
Done.

================
Comment at: loop-convert/LoopActions.cpp:327
@@ +326,3 @@
+///
+/// Note that this is the
+/// only isValidXXX function that confirms that the opcode is correct, as there
----------------
Manuel Klimek wrote:
> Formatting.
Comment removed, as noted below.

================
Comment at: loop-convert/LoopActions.cpp:328-330
@@ +327,5 @@
+/// Note that this is the
+/// only isValidXXX function that confirms that the opcode is correct, as there
+/// is only one way to trigger this case (namely, the builtin operator*).
+/// For example, if the index variable is `index`, returns true for
+///   *index
----------------
Manuel Klimek wrote:
> I'm not sure I understand what you're getting at. This seems to be a hint at the overall strategy of the various isValid* functions (this is not named isValid* btw, so this is further confusing). 
The isValid* functions were supposed to abstract checking so that both overloaded and builtin operators could call the same function (e.g. isIndexInSubscriptExpr for operator[] and calls to at()). The abstraction wasn't worth it for isDereferenceOf{OpCall,Uop}, because the only common check is exprReferencesVariable().

In any case, I added a check to make sure that the opcode was correct to make the interface more uniform, and removed the now obsolete comment.

================
Comment at: loop-convert/LoopActions.cpp:354
@@ +353,3 @@
+///   }
+static bool isAliasDecl(const Decl *TheDecl, const VarDecl *TargetDecl) {
+  const VarDecl *VDecl = dyn_cast<VarDecl>(TheDecl);
----------------
Manuel Klimek wrote:
> TargetDecl is really confusing here. I thought you meant the target of the initialization, but that's TheDecl :)
> 
> Perhaps s/TargetDecl/IndexVar/. In general, naming those things consistently would help.
I must have missed that during the general rename. Done.

================
Comment at: loop-convert/LoopActions.cpp:412
@@ +411,3 @@
+
+/// Permitted usages here are dereferences of pointer types.
+///
----------------
Manuel Klimek wrote:
> This showing up as the overview comment for the method in doxygen seems strange...
Comment fixed in a later revision :)

================
Comment at: loop-convert/LoopActions.cpp:525
@@ +524,3 @@
+/// operator[].
+/// permitted in iterator loops, so we note them for conversion.
+///
----------------
Manuel Klimek wrote:
> Leftover sentence?
Yes, that doesn't belong. Fixed in later revision.

================
Comment at: loop-convert/LoopActions.cpp:715
@@ +714,3 @@
+
+/// \brief Determine whether VDecl appears to be an initializing an iterator.
+///
----------------
Manuel Klimek wrote:
> There is no parameter named VDecl...
Fixed in later revision. Also, renamed the function to more accurately reflect what it does (getContainerFromBeginEndCall).

I see what you meant about assuming that the comments are wrong :)

================
Comment at: loop-convert/LoopActions.cpp:748
@@ +747,3 @@
+  const Expr *BeginInitExpr = BeginVar->getInit();
+  const Expr *EndInitExpr = EndVar ? EndVar->getInit() : EndExpr;
+
----------------
Manuel Klimek wrote:
> Please document this in the function comment.
Done. In the process, the function was rearranged a bit to be simpler.

================
Comment at: loop-convert/LoopActions.cpp:759
@@ +758,3 @@
+      return NULL;
+  const Expr *EndSourceExpr = getContainerFromInitializer(EndInitExpr,
+                                                          /*IsBegin=*/false,
----------------
Manuel Klimek wrote:
> The asymmetry between the name EndSourceExpr and ContainerExpr is confusing.
> Perhaps BeginContainerExpr and EndContainerExpr?
Done.

================
Comment at: loop-convert/LoopActions.cpp:779
@@ +778,3 @@
+
+  if (!Context->getSourceManager().isFromMainFile(TheLoop->getForLoc()))
+    return;
----------------
Manuel Klimek wrote:
> Any reasons we don't want to make those transformations in headers?
> (The strategy I've used so far was regexps on the path)
This was the simplest safe solution I found for avoiding changes to files we weren't instructed to modify (especially system headers). I imagine this tool would conceptually be run on a portion of a project (e.g. clang/tools/extra/), and should only make changes within that project.

If there is a better mechanism (regexps on the path sound fine), I'd be happy to use it!

================
Comment at: loop-convert/LoopActions.cpp:774
@@ +773,3 @@
+/// matchers are convertible, printing information about the loops if so.
+void LoopFixer::run(const MatchFinder::MatchResult &Result) {
+  TranslationConfidenceKind ConfidenceLevel = TCK_Safe;
----------------
Manuel Klimek wrote:
> Really long function, would be great to find a way to split this up.
120 lines is a bit long.

The function has been split into two pieces, each of which is far more reasonable.

================
Comment at: loop-convert/LoopActions.cpp:818
@@ +817,3 @@
+  ForLoopIndexUseVisitor Finder(Context, LoopVar, EndVar, ContainerExpr,
+                           ContainerNeedsDereference);
+
----------------
Manuel Klimek wrote:
> Indent.
Done.

================
Comment at: loop-convert/LoopActions.cpp:858
@@ +857,3 @@
+  // implementation and rely on conflicting edit detection instead.
+  if (ReplacedVarRanges->count(TheLoop)) {
+    ++*DeferredChanges;
----------------
Manuel Klimek wrote:
> When I came to this condition, I wondered where we had already tried to create replacements in this run() call. I think that's a general problem here - there's too many different things going on on the same level.
The answer is that we haven't already tried to create replacements in this run() call! This condition detects replacements //across// calls of run() (the logic for which belongs in RefactoringTool or a higher level, I think).

The refactored arrangement of this function now splits the callback into three pieces:

# run() gathers the matched nodes, and makes sure the loop's header is correct.
# findAndVerifyUsages() prepares and runs a ForLoopIndexUseVisitor, then decides if it should make the changes
# doConversion() does the actual refactoring work

Does this seem more manageable?

================
Comment at: loop-convert/LoopConvert.cpp:81
@@ +80,3 @@
+  if (int result = SyntaxTool.run(
+      newFrontendActionFactory<clang::SyntaxOnlyAction>())) {
+    llvm::errs() << "Error compiling files.\n";
----------------
Manuel Klimek wrote:
> I think that indent is bad, as it looks like there's two things on the same level.
I think this is better.

================
Comment at: loop-convert/LoopMatchers.cpp:28
@@ +27,3 @@
+  return id(LoopName, forStmt(
+      hasLoopInit(declarationStatement(hasSingleDecl(id(InitVarName, variable(
+          hasInitializer(ignoringImpCasts(integerLiteral(equals(0))))))))),
----------------
Manuel Klimek wrote:
> I'd only use one binding concept (.bind()), as it makes it much easier to see at a glance what is being bound.
I tried to use id() only when the entire matcher being bound wasn't clearly visible at a glance. Especially for binding LoopName to the forStmt matcher, I 
dislike placing the name 15 lines away from the matcher. Some formatting changes made .bind() much easier to read, so I'm happy now.

================
Comment at: loop-convert/StmtAncestor.h:21-32
@@ +20,14 @@
+
+/// A map used to walk the AST in reverse: maps child Stmt to parent Stmt.
+typedef llvm::DenseMap<const Stmt*, const Stmt*> StmtParentMap;
+/// A map used to walk the AST in reverse:
+///  maps VarDecl to the to parent DeclStmt.
+typedef llvm::DenseMap<const VarDecl*, const DeclStmt*> DeclParentMap;
+/// A map used to track which variables have been removed by a refactoring pass.
+/// It maps the parent ForStmt to the removed index variable's VarDecl.
+typedef llvm::DenseMap<const ForStmt*, const VarDecl *> ReplacedVarsMap;
+/// A map used to remember the variable names generated in a Stmt
+typedef llvm::DenseMap<const Stmt*, std::string> StmtGeneratedVarNameMap;
+/// A vector used to store the AST subtrees of an Expr.
+typedef llvm::SmallVector<const Expr *, 16> ComponentVector;
+
----------------
Manuel Klimek wrote:
> I think adding some newlines would make this easier to read.
Done.

================
Comment at: loop-convert/StmtAncestor.h:47
@@ +46,3 @@
+  /// work unless RunEvenIfNotEmpty is set to true.
+  void gatherAncestors(const TranslationUnitDecl *TUD, bool RunEvenIfNotEmpty) {
+    if (RunEvenIfNotEmpty || StmtAncestors.empty()) {
----------------
Manuel Klimek wrote:
> Seems to be only called once - why the bool parameter?
That was left as a hook in case some way to update the AST (after making a refactoring change) appeared. I've removed the extra parameter it for now.

================
Comment at: loop-convert/StmtAncestor.h:48
@@ +47,3 @@
+  void gatherAncestors(const TranslationUnitDecl *TUD, bool RunEvenIfNotEmpty) {
+    if (RunEvenIfNotEmpty || StmtAncestors.empty()) {
+      TraverseDecl(const_cast<TranslationUnitDecl *>(TUD));
----------------
Manuel Klimek wrote:
> Can this ever be called twice on the same object? why not assert(StmtAncestors.empty())?
It could be called twice on the same object; I just don't do that right now. In fact, if we run this across multiple TUs, then we will have to call gatherAncestors() once per TU.

================
Comment at: loop-convert/StmtAncestor.h:115
@@ +114,3 @@
+  /// some variable declared within ContainingStmt.
+  bool dependsOnOutsideVariable(const Stmt *Body) {
+    DependsOnOutsideVariable = false;
----------------
Manuel Klimek wrote:
> This one especially could need an example or two in the docs :) I find this to be one of the most complex parts of this code.
I added an example that (I think!) makes it clear what this is trying to protect against.

================
Comment at: loop-convert/VariableNaming.h:32-35
@@ +31,6 @@
+ public:
+  VariableNamer(ASTContext *Context, StmtGeneratedVarNameMap *GeneratedDecls,
+                const StmtParentMap *ReverseAST, const DeclContext *LoopContext,
+                const Stmt *SourceStmt, const VarDecl *OldIndex,
+                const VarDecl *TheContainer) :
+  Context(Context), GeneratedDecls(GeneratedDecls), ReverseAST(ReverseAST),
----------------
Manuel Klimek wrote:
> I'm wondering whether there's an abstraction missing for that kind of argument list, but I cannot come up with it currently. Something to keep in the back of our minds.
I've been considering that too. VariableNamer essentially takes a few naming hints (TheContainer, OldIndex), and some parameters which let it decide if a name was already taken (GeneratedDecls, ReverseAST, LoopContext, SourceStmt). The separation of responsibilities happens internally, though I did not feel creating more complexity because this is a function run once on a single object.

VariableNamer class pretty much exists as a place to put all the naming worries so that they doesn't clutter up LoopFixer. I'm open to ideas...

================
Comment at: loop-convert/VariableNaming.cpp:22-23
@@ +21,4 @@
+
+  // FIXME: Maybe create a class so that this call doesn't need 6 parameters
+  // every time?
+  if (!declarationExists(IteratorName))
----------------
Manuel Klimek wrote:
> Is that comment outdated?
What gave that away? :)

================
Comment at: loop-convert/VariableNaming.cpp:52
@@ +51,3 @@
+/// converter in a loop nested within SourceStmt.
+bool VariableNamer::declarationExists(const std::string& Symbol) {
+  IdentifierInfo& Identifier = Context->Idents.get(Symbol);
----------------
Manuel Klimek wrote:
> Any reason not to use StringRef?
No reason at all. StringRef it is.

================
Comment at: loop-convert/VariableNaming.cpp:57-68
@@ +56,14 @@
+
+  // First, let's check the parent context.
+  // FIXME: lookup() always returns the pair (NULL, NULL) because its
+  // StoredDeclsMap is not initialized (i.e. LookupPtr.getInt() is false inside
+  // of DeclContext::lookup()). Why is this?
+  // NOTE: We work around this by checking when a shadowed declaration is
+  // referenced, rather than now.
+  for (const DeclContext *CurrContext = LoopContext; CurrContext != NULL;
+       CurrContext = CurrContext->getLookupParent()) {
+    DeclContext::lookup_const_result Result = CurrContext->lookup(Name);
+    if (Result.first != Result.second)
+        return true;
+  }
+
----------------
Manuel Klimek wrote:
> I think I mentioned that in one of the pre-phab review mails, but I'm not sure any more :) So, is this now used, or not?
No, it's not used...though it would be wonderful if someone who knows how DeclContexts work could explain what's going wrong.

Removed, along with all the parameters that were only involved in the ineffective code.

================
Comment at: loop-convert/LoopActions.cpp:25
@@ +24,3 @@
+  /// what needs to be replaced.
+  SourceRange Range;
+};
----------------
Manuel Klimek wrote:
> In general, this seems like a design smell - can we instead always use Range, and just set it to E->getSourceRange when IsArrow is false? Perhaps even via a constructor, so this is impossible to use incorrectly?
Done. The nearby code is now a bit nicer too.

================
Comment at: loop-convert/LoopMatchers.h:34
@@ +33,3 @@
+ast_matchers::StatementMatcher makeIteratorLoopMatcher();
+ast_matchers::StatementMatcher makePseudoArrayLoopMatcher();
+} //namespace loop_migrate
----------------
Manuel Klimek wrote:
> Please insert newline.
Done.

================
Comment at: loop-convert/LoopMatchers.cpp:22
@@ +21,3 @@
+  static StatementMatcher LHSMatcher =
+  expression(ignoringImpCasts(declarationReference(to(
+      variable(hasType(isInteger())).bind(ConditionVarName)))));
----------------
Manuel Klimek wrote:
> +1 on that they shouldn't be static. Also, indent.
Done. What's the opinion on static matchers outside of the factory functions to increase sharing between the factory functions?

================
Comment at: loop-convert/VariableNaming.h:23
@@ +22,3 @@
+namespace loop_migrate {
+/// \brief VariableNamer - Create names for generated variables within
+/// a particular statement.
----------------
Manuel Klimek wrote:
> Also, I'd either leave out the VariableNamer here, or create a sentence like: VariableNamer creates names...
Redundant name removed.

================
Comment at: loop-convert/VariableNaming.h:22
@@ +21,3 @@
+namespace clang {
+namespace loop_migrate {
+/// \brief VariableNamer - Create names for generated variables within
----------------
Manuel Klimek wrote:
> I'd add a whitespace for readability. In general, I find comments starting directly after something else without an empty line to be hard on the eye.
Agreed; whitespace added.

http://llvm-reviews.chandlerc.com/D19