[PATCH] D15124: Use @llvm.invariant.start/end intrinsics to extend basic AA with invariant range analysis for GVN-based load elimination purposes [Local objects only]

Fri Jan 15 04:40:04 PST 2016

lvoufo marked 15 inline comments as done.
lvoufo added a comment.

I hope I addressed all the comments. Let me know if I missed anything.


================
Comment at: include/llvm/Analysis/InvariantInfo.h:71
@@ +70,3 @@
+  /// \brief Sets the dominator and post-dominator trees.
+  void init(DominatorTree *DomTree, PostDominatorTreeT *PostDomTree) {
+    // If the dominator and postdominator trees have changed, then
----------------
chandlerc wrote:
> Using an init function is not terribly common in LLVM. Why not use a constructor?
Perhaps `set` makes more sense here than `init`.

`InvariantInfo` serves two purposes. First, it tracks all invariant intrinsics in the input program. Second, it performs invariant region analysis *on demand*, from basic AA and GVN, based on tracked intrinsics. The region analysis requires dominance info, but tracking intrinsics, e.g. to print out or verify invariant info (via the passes `InvariantInfoPrinterPass` or `InvariantInfoVerifierPass`), does not need dominance info. So, it does not make much sense to require dominance info in the constructor of `InvariantInfo`. 

Besides, requiring dominance info in the constructor does not work very well with the pass manager system(s). For example, look at the uses of `InvariantInfo` in the passes `InvariantInfoAnalysis` and `InvariantRangeAnalysisPass`. There is no guarantee that `DominatorTree` or `PostDominatorTreeT` instances will be created (and present) when these passes are instantiated.  Even if the instances are present, there is no guarantee that they would be the instances that the range analysis needs to be run with.

Ideally, instead of introducing `init()` here, one would make both `InvariantRangeAnalysisPass` and `PostDominatorTree` dependencies of `BasicAAWrapperPass` (along with `DominatorTreeWrapperPass`), and pass the arguments of `init()` herein as additional arguments to `pointsToReadOnlyMemory()`. But this either stands in the way of or is redundant with two of our objectives, which are as follows.
1. Keep as much of our changes -- and their effect -- as possible local to GVN, and only for purposes of load elimination (for now)---at least until we are okay with using post-dominance info more pervasively throughout LLVM.
2. Ensure that both dominator and post-dominator passes are instantiated whenever invariant range analysis is enabled. (Note that we can do the analysis without both passes.)

To achieve Objective #1, we rely on calls to `EnableInvariantRangeAnalysis()` from `GVN::runOnFunction()` which appropriately enable and disable the analysis. This allows us to avoid adding a new field in `BasicAAResults`, and, more generally speaking, to avoid creating situations where implementation details are unnecessarily exposed at function call sites thereby hindering future-proof-ness. (In other words, the `pointsToReadOnlyMemory()` call from basic AA does not have to express that its implementation uses dominance info. That kind of detail should be left to `InvariantInfo` if possible.)

To achieve Objective #2, we rely on `InitDependencies()`, which throws a runtime exception if both passes are not available when invariant range analysis is enabled, and directly sets pointers to the passes in InvariantInfo. This is okay because we have made sure to add `PostDominatorTree` as a dependency of `GVN` that is explicitly required in `getAnalysisUsage()`. We cannot directly make the passes dependencies of `InvariantRangeAnalysisPass` because `InvariantRangeAnalysisPass` is an `ImmutablePass`...

Instead of relying on `InitDependencies()`, could we just not do range analysis if the passes are not available? Sure. But we just would not know if we've erroneously missed opportunities for load elimination.

As per `init`, `InitDependencies` should probably be renamed to `SetDependencies`.

================
Comment at: include/llvm/Analysis/InvariantInfo.h:81
@@ +80,3 @@
+  /// \brief Enables invariant range analysis.
+  void EnableInvariantRangeAnalysis(bool DoEnable = true) {
+    RangeAnalysisIsEnabled = DoEnable;
----------------
chandlerc wrote:
> The style of this patch seems mixed. Sometimes methods start with a lower case, some times an upper case. Please try to consistently follow the LLVM coding standards.
My mistake... I'll fix that.

================
Comment at: include/llvm/Analysis/InvariantInfo.h:85-96
@@ +84,14 @@
+
+  /// \brief Retrieves the part of the invariants mapping that is associated
+  /// with the given function.
+  InvListMapT &getInvariantsMap(const Function &F);
+
+  /// \brief Access the invariants mapping to extract invariant_start/end
+  /// instructions that are associated with the given address.
+  InvListT &getInvariants(const Function &F, const Value *Addr);
+
+  /// \brief Adds an entry to the invariants mapping.
+  void addInvariant(const Function &F, const Value *Addr,
+                    const IntrinsicInst *IStart);
+
+  /// \brief Populates the invariants mapping for the given function.
----------------
chandlerc wrote:
> Why are these public methods? they only appear to be called from within the implementation or from diagnostic utilities.
Right. I'll fix these too. (I was preoccupied with the functional aspect of the implementation.)

================
Comment at: include/llvm/Analysis/InvariantInfo.h:195-218
@@ +194,26 @@
+
+/// \brief The -invariant-range-analysis pass.
+class InvariantRangeAnalysisPass : public ImmutablePass {
+  InvariantInfo InvInfo;
+
+ public:
+  static char ID;  // Pass identification
+  explicit InvariantRangeAnalysisPass();
+  ~InvariantRangeAnalysisPass() override;
+
+  InvariantInfo &getInvariantInfo() { return InvInfo; }
+  const InvariantInfo &getInvariantInfo() const { return InvInfo; }
+
+  void InitDependencies(FunctionPass &P);
+
+  void releaseMemory() override { InvInfo.clear(); }
+  void verifyAnalysis() const override;
+  void getAnalysisUsage(AnalysisUsage &AU) const override;
+  void print(raw_ostream &OS, const Module *) const override;
+  void dump() const;
+  bool doFinalization(Module &) override {
+    verifyAnalysis();
+    return false;
+  }
+};
+
----------------
chandlerc wrote:
> Why use a pass at all? You don't seem to be getting much value from the pass infrastructure currently. It looks like currently, you could just have a function like "populateInfo" which returns an InvariantRangeInfo object after analyzing that function, and the InvariantRangeInfo object could have a method called "pointsToReadOnlyMemory"?
> 
> Do you have specific future changes that would require the pass infrastructure to work? It's entirely possible I'm missing something, but if so, it would be very helpful to explain this kind of detail in the comments.
Ok. This was addressed in at least one previous thread. In principle, the `-invariant-range-analysis` pass is similar to the `-assumption-tracker` pass, except that it enables selective data analysis in addition to data tracking. (`-assumption-tracker` only does data tracking, enabling analysis by other passes.) 

Like `-assumption-tracker`, `-invariant-range-analysis` populates and uses tracked data across different passes, in this case GVN and basic AA. GVN selectively enables the analysis while basic AA requests the analysis to be performed... cf. Objective #1 in earlier comment.

The selective process in GVN could be (incrementaly) adapted in other passes beyond GVN where it makes sense to extend basic AA with invariant range analysis, e.g., places where `pointsToConstantMemory()` is called with `OrLocal == false`.


================
Comment at: include/llvm/Analysis/PostDominators.h:131-148
@@ -114,1 +130,20 @@
 
+/// \brief Analysis pass which computes a \c PostDominatorTree.
+class PostDominatorTreeAnalysis {
+ public:
+  /// \brief Provide the result typedef for this analysis pass.
+  typedef PostDominatorTreeT Result;
+
+  /// \brief Opaque, unique identifier for this analysis pass.
+  static void *ID() { return (void *)&PassID; }
+
+  /// \brief Run the analysis pass over a function and produce a dominator tree.
+  PostDominatorTreeT run(Function &F);
+
+  /// \brief Provide access to a name for this pass for debugging purposes.
+  static StringRef name() { return "PostDominatorTreeAnalysis"; }
+
+ private:
+  static char PassID;
+};
+
----------------
chandlerc wrote:
> Please split the port of postdom to the new pass manager into a separate patch. Even if it is truly necessary, the port shouldn't be part of the patch that introduces the invariant stuff.
Yes. You probably missed this, but when this patch was submitted, the description indicated that the port will have to be done separately...
Indeed, this is a very minimal port, only intended for getting this patch working. The actual port will have to be a bit more substantial than that...
(I am ready to do it as soon as the invariant portion of this patch is agreed upon.)

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:492-493
@@ -490,4 +491,4 @@
 
     // A global constant counts as local memory for our purposes.
     if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {
       // Note: this doesn't require GV to be "ODR" because it isn't legal for a
----------------
I am marking these off because this extension has been moved out of here and into `pointsToReadOnlyMemory()`...

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:678
@@ +677,3 @@
+static bool isInvariantIntrinsic(ImmutableCallSite CS) {
+  return IsInvariantIntrinsic(dyn_cast<IntrinsicInst>(CS.getInstruction()));
+}
----------------
chandlerc wrote:
> It isn't clear what value this helper function is providing.
> 
> Is it valid to pass a null pointer here? I find that surprising in and of itself.
> 
> Either way, I think it would be more clear to either teach the routine that is already provided by InvariantInfo.h to accept ImmutableCallSite objects, or to inline the extraction of the instruction into the places this is called below.
Ok.

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:742-743
@@ -736,1 +741,4 @@
 
+  // Invariant intrinsics follow the same pattern as assume intrinsic.
+  if (isInvariantIntrinsic(CS)) return MRI_NoModRef;
+
----------------
chandlerc wrote:
> This formatting seems inconsistent. Run clang-format?
Ok. Thought I did, but will try again, and pay closer attention...

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:742-743
@@ -736,1 +741,4 @@
 
+  // Invariant intrinsics follow the same pattern as assume intrinsic.
+  if (isInvariantIntrinsic(CS)) return MRI_NoModRef;
+
----------------
lvoufo wrote:
> chandlerc wrote:
> > This formatting seems inconsistent. Run clang-format?
> Ok. Thought I did, but will try again, and pay closer attention...
This is necessary for the test cases below, where load instructions are merged into other load instructions across intrinsic calls.
Consider calling `getModRef(Inst, Loc)` from memdep where `Inst` is an intrinsic_start call.  Without this, `pointsToReadOnlyMemory()` does not know that `Inst` does not modify `Loc.Ptr` if there is `Inst` itself is not already in another invariant range over `Loc.Ptr`.

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:768-770
@@ -748,1 +767,5 @@
 
+  // Invariant intrinsics follow the same pattern as assume intrinsic.
+  if (isInvariantIntrinsic(CS1) || isInvariantIntrinsic(CS2))
+    return MRI_NoModRef;
+
----------------
chandlerc wrote:
> You haven't addressed Nick's comment that teaching BasicAA to ignore the invariant intrinsic calls themselves is a separable patch that should be split out and submitted independently. Please do so.
I just addressed Nick's comment above. Sorry I missed it earlier. 

I am not sure that all of the test cases below would still work without this... I think that keeping these around are actually an essential part of the extended logic for load elimination. I'll take another look and see...

================
Comment at: lib/Analysis/InvariantInfo.cpp:36-38
@@ +35,5 @@
+  if (!InvariantsComputed.count(&F)) populateInfo(F);
+  return AllInvariantsMaps[&F];
+}
+
+InvariantInfo::InvListT &InvariantInfo::getInvariants(const Function &F,
----------------
chandlerc wrote:
> I don't really understand your concern here.
> 
> The comments are using imprecise language in that they're not using the standard's precise terminology, but the intent seems clear: it returns whatever "T()" produces. That *always* produces a null pointer when T is a pointer type.
> 
> All of LLVM's code (and much of Clang's) relies on the lookup method being reasonable to call when the result is a pointer. Please follow this convention as suggested by Nick originally.
Right. This comment is outdated by the way... The implementation was very different when at the time of Nick's comment.

================
Comment at: lib/Analysis/InvariantInfo.cpp:89-95
@@ +88,9 @@
+
+/// \brief An instruction cannot modify memory if the instruction is in an
+/// invariant range over the memory's pointer value.
+/// This implementation is modeled after \c pointsToConstantMemory(), with
+/// OrLocal == false, and uses \c queriedFromInvariantRange() to check if
+/// the given instruction modify the given memory.
+/// In practice, this function should be called immediately after
+/// \c pointsToConstantMemory() (with OrLocal == false) returns false.
+bool InvariantInfo::pointsToReadOnlyMemory(const Instruction *I,
----------------
chandlerc wrote:
> You have two different API comments, one in the header, and one in the source. Please only have one comment. For public interfaces, it should be in the header. For private functions, you can put it near the implementation if you want.
Ok. Will do.

================
Comment at: lib/Analysis/InvariantInfo.cpp:139
@@ +138,3 @@
+  } while (!Worklist.empty() && --MaxLookup);
+
+  return Worklist.empty();
----------------
Marking these off because they are outdated...

================
Comment at: lib/Analysis/PostDominators.cpp:52-67
@@ +51,18 @@
+  // post-dominates the instruction.
+  // TODO: Special case for invoke instructions:
+  //       Allow invoke instructions to be post-dominated by instructions in
+  //       their normal destination path to allow further optimizations from
+  //       invariant range analysis such as the elimination of the second load
+  //       instruction in
+  //
+  //          <code>
+  //          try { ...
+  //                @llvm.invariant.start(..., bitcast i32* %i to i8*)
+  //                load %i
+  //                invoke(...)
+  //                load %i
+  //                ...
+  //                @llvm.invariant.end(..., bitcast i32* %i to i8*)
+  //          } catch (...) { ... }
+  //          </code>
+  //
----------------
chandlerc wrote:
> This doesn't seem like it belongs in the postdom pass. Instead if anything it should live somewhere with the invariant handling code.
Yes. This part should remain with this pass, while the rest of the postdom migration to the new pass manager should go in the postdom migration patch.

================
Comment at: lib/Transforms/Scalar/GVN.cpp:2441-2447
@@ +2440,9 @@
+
+    // Enable invariant range analysis here to keep its effect local to GVN.
+    auto &InvRA = getAnalysis<InvariantRangeAnalysisPass>();
+    InvInfo = &InvRA.getInvariantInfo();
+    InvInfo->EnableInvariantRangeAnalysis();
+
+    // Once enabled, initialize the analysis' dependencies.
+    InvRA.InitDependencies(*this);
+  }
----------------
chandlerc wrote:
> This does not seem like the right design and seems over-engineered.
> 
> Here is an alternative that seems much simpler: pass your InvariantRangeAnalysis or InvariantRangeInfo in as an optional parameter to MemoryDependenceAnalysis::getSimplePointerDependencyFrom and the few APIs that call it. Then GVN can opt into building the invariant information structures and passing them into the query and other passes can skip that. Then, within memdep, where we query AA, also query the invariant analysis.
> 
> Perhaps there are other designs as well that would be simpler and more direct? I'm curious what others think.
Hmm... See my earlier reply about Objective #1. This sounds like it'd basically shift the selective process from GVN into memdep, and increase opportunities for errors along the way. Note that GVN does not call `getSimplePointerDependencyFrom` directly. It starts with `getDependency` which eventually calls `getSimplePointerDependencyFrom` through multiple hoops in the call chain.

I prefer to keep things as simple as possible, i.e., enable the analysis in one place and not worry about it again until the analysis is requested in basic AA...

Open for alternative suggestions too.
 


http://reviews.llvm.org/D15124