[PATCH] D23738: [PM] Extend the explicit 'invalidate' method API on analysis results to accept an Invalidator that allows them to invalidate themselves if their dependencies are in turn invalidated.

Chandler Carruth via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 22 14:39:07 PST 2016


chandlerc marked 27 inline comments as done.
chandlerc added a comment.

Whew, finally through this. Thanks so much Justin for the review. Responses to a few issues inline, and an updated patch on its way.



================
Comment at: include/llvm/IR/PassManager.h:369
+
+  /// \brief Map type from function pointer to our custom list type.
+  typedef DenseMap<IRUnitT *, AnalysisResultListT> AnalysisResultListMapT;
----------------
jlebar wrote:
> An IRUnitT is not necessarily a function pointer, nor a pointer to an llvm::Function, right?
Just a stale comment.


================
Comment at: include/llvm/IR/PassManager.h:373
+  /// \brief Map type from a pair of analysis ID and function pointer to an
+  /// iterator into a particular result list.
+  typedef DenseMap<std::pair<void *, IRUnitT *>,
----------------
jlebar wrote:
> This comment in particular, but also the one above, just sort of spells out in English what the type is.  It's definitely helpful to tell me what "void*" is, but you could write that as
> 
>   std::pair</* AnalysisID */ void*, IRUnit *>
> 
> or even
> 
>   using AnalysisID = void*;
>   std::pair<AnalysisID, IRUnit *>
> 
> Once we've made the code clearer in this way, perhaps we could use the comments to say something about what these types are for.
True, the comment here is just trying to explain the type itself. The members below that use them I think have more helpful comments?

I could try to clean up the type names with aliases and remove the comments if you like. But maybe in a follow-up patch? It seems orthogonal to what this is trying to do.

I've cleaned up the comments very minorly here, but not in any way that substantively addresses this.


================
Comment at: include/llvm/IR/PassManager.h:386
+  /// results themselves define the dependency graph rather than tracking and
+  /// hard coding it anywhere.
+  class Invalidator {
----------------
jlebar wrote:
> Rephrase starting with "rather than".  I think this is important, because it's kind of the motivation for the whole design, but I don't grok what you're trying to get across.
I've tried to clarify this. Keep complaining until it makes sense though.


================
Comment at: include/llvm/IR/PassManager.h:596
+      // on the type erased result where it doesn't, but we don't have to look
+      // up the result in the map where it does.
+      void *PassID = AnalysisResultPair.first;
----------------
jlebar wrote:
> again s/where/whereas/ (or "while") would help, I think.
> 
> Or maybe just
> 
>   This is basically the same thing as Invalidator::invalidate, but we can't call it here because we're operating on the type-erased result.  Moreover if we instead called invalidate() directly, it would do an unnecessary lookup in ResultsList.
> 
> Or maybe even leave out that second sentence, because it's a premature optimization.  :)
I like your wording much better than mine. Stolen.

And I still want to document that we're dodging a map lookup -- if we go back, the author should be aware of this even if it is worth the tradeoff.


================
Comment at: include/llvm/IR/PassManager.h:610
+      IMapI->second = Result.invalidate(IR, PA, Inv);
+    }
+
----------------
silvas wrote:
> jlebar wrote:
> > WRT the names:
> > 
> > I agree "invalidate" isn't a great name, because
> > 
> >   if (!my_pass->invalidate(...)) {
> >     // my_pass was not invalidated!
> >   }
> > 
> > `checkIfInvalidated`?  `onIRChange`?  `notifyIRChanged`?  I like the first one because the bool return value sort of makes sense.
> > 
> > This would also change the comment above -- we're not really "trying" to invalidate the result.  More informing it that something has changed, and giving it the option to mark itself as invalid.
> > 
> > I still don't really understand what we're supposed to do inside the `invalidate` function, so I'm not sure about the `Invalidator` class name.  The only example I have is
> > 
> >     bool TestIndirectFunctionAnalysis::invalidate(Function &F, const PreservedAnalyses &PA,
> >                     FunctionAnalysisManager::Invalidator &Inv) {
> >       return !PA.preserved<TestIndirectFunctionAnalysis>() ||
> >              Inv.invalidate<TestFunctionAnalysis>(F, PA);
> >     }
> > 
> > It naively seems to me that it would be better to structure things so that we never run this function if PA.preserved<TestIndirectFunctionAnalysis>().  The Invalidator could ensure this.  Then our function could be
> > 
> >   bool checkIfInvalidated(...) {
> >     return Inv.checkIfInvalidated<Dep1>() || Inv.checkIfInvalidated<Dep2>();
> >   }
> > 
> > which actually makes sense to me.  I think ideally Inv.checkIfInvalidated would take zero parameters, since it's just boilerplate, but maybe it should still take the Function/Module/whatever.  We'd at least be able to get rid of the PreservedAnalyses argument, I think?
> Currently, the `invalidate` method is a way to opt out of being invalidated (e.g. for TTI which knows it only depends on the target).
> 
> However, this cannot trigger invalidation of additional analysis result objects, which is what is needed when one analysis result holds pointers to another analysis results in order to avoid use-after free errors[1]. Hence the addition of the extra invalidation machinery in this patch. (this patch is by no means a comprehensive solution; I mentioned some problem areas in the initial comment in this review). 
> 
> The `Invalidator` that is added in this patch basically provides a constrained interface to the analysis result list so that analysis result objects can transitively find and check whether any analysis result objects that they hold a pointer to have been invalidated. This allows invalidation to propagate from dependencies to dependents and trigger additional invalidation, avoiding use-after-free. (for layering reasons, analyses cannot know what will depend on them, but only what they depend on, so to do this with object recursion requires this process to be top-down dependent-to-dependency relying on memoization, rather than bottom-up dependency-to-dependent with a direct traversal)
> 
> 
> Also, unfortunately, it is not entirely practical to avoid calling the `invalidate` method when `PA.preserved<FooAnalysis>()` is true. The difficulty again occurs when analysis result objects for FooAnalysis hold pointers to other analysis result objects. In that case, transformation passes would have to call `PA.preserve<BarAnalysis>()` for every analysis that FooAnalysis result objects transitively hold a pointer to (otherwise there would be a dangling pointer / use-after-free error when the BarAnalysis result object is invalidated due to not being in the "preserved" set). There is actually an additional difficulty due to `AliasAnalysis`, which depends on a dynamic (but fixed for a given pass pipeline invocation) set of type-erased analyses; hence there is no static set of `PA.preserve<...>()` calls which is necessarily correct.
> 
> I don't think that the `PreservedAnalyses` argument can be eliminated, because the `!PA.preserved<FooAnalysis>()` check is essentially the base-case of the object recursion facilitated by the `Invalidator`. I.e. it determines the roots from which invalidation must be propagated from dependency to dependent (although for layering reasons the object recursion goes from dependent to dependency, with memoization to keep the complexity under control).
> 
> [1] This is actually pervasive. Running any non-trivial pipeline on any non-trivial piece of code will currently trigger numerous use-after free errors with the new pass manager.
First and foremost, I'm down with renaming this, but I'd really like that to be a separate patch. It'll be a ton of boring changes that I don't want to try and juggle in this commit.

Second, regarding the preserved analysis argument and the IR unit argument: I think they both have good reasons to exist.
1) Transformations may want to preserve an entire set of analyses, and analyses should query in the preserved analyses to see if a set that covers them has been preserved even if the individual analyses has not been.
2) Analyses may want to check trivial information about the IR to determine from first principles that they have been preserved.

And as Sean points out, this is really just about adding the first of several pieces of infrastructure necessary to make invalidation reliable.

I'd like to brainstorm different names for this outside the review though to see if there is a clearly superior alternative.


================
Comment at: include/llvm/IR/PassManagerInternal.h:99
   ///
   /// \returns true if the result is indeed invalid (the default).
+  virtual bool invalidate(IRUnitT &IR, const PreservedAnalysesT &PA,
----------------
jlebar wrote:
> Based on this comment I have no idea what I am supposed to do with the InvalidatorT object.
Yea, I failed to update this. Fixed.


================
Comment at: lib/Analysis/CGSCCPassManager.cpp:79
+  // AnalysisManager are preserved. We mark this with a set so that we don't
+  // need to inspect each one individually.
+  PA.preserve<AllAnalysesOn<LazyCallGraph::SCC>>();
----------------
jlebar wrote:
> I don't understand this comment, particularly "with a set".
Tried better words.


================
Comment at: unittests/IR/PassManagerTest.cpp:458
+  // another function invalidate the indirect analysis, and for the final
+  // function invalidate both.
+  FPM.addPass(LambdaPass([&](Function &F, FunctionAnalysisManager &AM) {
----------------
jlebar wrote:
> The "for one function invalidate" form is awkward for me.  Maybe this would be easier to read as a bulleted list:
> 
>   Next, invalidate
>     - the indirect analysis for g(),
>     - the underlying analysis for h(), and
>     - both analyses for f().
> 
> It might also make sense to rejigger the code so that we can change the comment to read "f()", "g()", and "h()".
Not sure what rejiggering you had in mind here, but the comment seems sane now?


https://reviews.llvm.org/D23738





More information about the llvm-commits mailing list