[PATCH] D13606: [Introduction] Redundant load reduction with invariant intrinsics

Nick Lewycky via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 9 15:16:45 PDT 2015


nlewycky added inline comments.

================
Comment at: include/llvm/IR/GlobalVariable.h:32-33
@@ -31,3 +31,4 @@
 class Constant;
 template <typename ValueSubClass> class SymbolTableListTraits;
+class IntrinsicInst;
 
----------------
Alphabetize.

================
Comment at: include/llvm/IR/GlobalVariable.h:48
@@ -46,1 +47,3 @@
                                                // initializers are run?
+  IntrinsicInst *InvariantStartInst;   // Transient
+
----------------
This is a change to the concept of what a global variable in LLVM IR is, and therefore would require a matching LangRef change (and bitcode reader/writer change).

If you need to store information about a GV for a pass, please create a map<GlobalVariable*, IntrinsicInst*> in your pass.

================
Comment at: include/llvm/IR/Instructions.h:36-37
@@ -35,3 +35,4 @@
 class DataLayout;
 class LLVMContext;
+class IntrinsicInst;
 
----------------
Alphabetize.

================
Comment at: include/llvm/IR/Instructions.h:79
@@ -77,1 +78,3 @@
   Type *AllocatedType;
+  IntrinsicInst *InvariantStartInst;   // Transient.
+
----------------
Again, this is a fundamental change to what an alloca instruction is.

Question for you, what if there's more than one invariant call on this alloca?

Note that this doesn't form a use, so what if someone deletes the invariant intrinsic? This is left hanging and pointing to nothing, or worse someone could reallocate a new different instruction at the same address.

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:500
@@ -495,3 +499,3 @@
       // others.  GV may even be a declaration, not a definition.
-      if (!GV->isConstant()) {
+      if (!GV->isConstant()  && !GV->getInvariantStartInstruction()) {
         Visited.clear();
----------------
What if the GV has an invariant start instruction, but the query is from before the GV initializer has finished running? A "false negative" in this if-expression fails unsafe, skipping this block will cause us to go into code which is written to assume GV->isConstant is true. If that were sufficient, then there's no need to have getInvariantStartInstruction at all, you could just set isConstant instead.

You need to check that the invariant has already executed, and the way to do that is to see that it dominates (which currently we can only do in a function-local sense). I don't know what cases you're solving this way, but my stab-in-the-dark guess is that you should probably be getting them with changes to MemoryDependenceAnalysis instead; if you see an invariant.end you can skip the scan of instructions up to the matching invariant.start which it conveniently takes as its first argument. That way, you never even get into the point of querying basicaa about the instructions in between, therefore they don't appear to be potentially conflicting for any user of memdep (GVN, DSE, Memcpyopt).

================
Comment at: lib/IR/Instructions.cpp:1155-1158
@@ -1152,1 +1154,6 @@
 
+void AllocaInst::checkInvariantStartInstruction(IntrinsicInst *II) {
+  assert((!II || II->getIntrinsicID() == Intrinsic::invariant_start) &&
+      "Given intrinsic instruction is not invariant_start");
+}
+
----------------
What about II->getParent() == this->getParent()?

================
Comment at: lib/Transforms/IPO/GlobalOpt.cpp:2247-2250
@@ -2239,2 +2246,6 @@
 
+  /// ReadOnlys - These global variables are writeonce variables that
+  /// have been marked readonly by the static constructor.
+  SmallPtrSet<GlobalVariable*, 8> ReadOnlys;
+
   /// SimpleConstants - These are constants we have checked and know to be
----------------
I don't think you need this?

GlobalOpt performs symbolic execution of each instruction in order. It will see, and "execute" an invariant start instruction by inserting it into the "Invariants" set above. This is a more accurate model, since before the execution occurs it is not yet constant, and it becomes constant afterwards. 

I'm not sure what cases processInvariantIntrinsics is catching which the existing code is not, and I'd like to review the GlobalOpt changes independently.

================
Comment at: lib/Transforms/IPO/GlobalOpt.cpp:2640-2641
@@ +2639,4 @@
+
+  // Scan the block to process invariant intrinsics, tracing whatever
+  // call chain that can be traced.
+  // Without this, invariant intrinsics on global variables, can only be
----------------
This is what the Evaluator does (except not just for invariant intrinsics). This code is a second evaluator inside the evaluator, please don't do that, just integrate with the rest of GlobalOpt.

================
Comment at: lib/Transforms/IPO/GlobalOpt.cpp:2646-2648
@@ +2645,5 @@
+  // or -O2's -inline?
+  BasicBlock *BB = F->begin();
+  while (BB) {
+    BasicBlock* NextBB = nullptr;
+    BasicBlock::iterator CurInst = BB->begin();
----------------
  for (Function::iterator BB = F->begin(), BE = F->end(); BB != BE; ++BB) {
    // Function::iterator implicitly converts into a BasicBlock*, so "BB->..." works fine
  }


================
Comment at: lib/Transforms/IPO/GlobalOpt.cpp:2648
@@ +2647,3 @@
+  while (BB) {
+    BasicBlock* NextBB = nullptr;
+    BasicBlock::iterator CurInst = BB->begin();
----------------
"BasicBlock* NextBB" --> "BasicBlock *NextBB".

================
Comment at: lib/Transforms/IPO/GlobalOpt.cpp:2683
@@ +2682,3 @@
+            if (ConstantInt *Cond =
+                dyn_cast_or_null<ConstantInt>(Eval.getVal(BI->getCondition(),
+                                              /*CheckComputed =*/ false)))
----------------
So the only things Eval.getVal() will return non-null for are Arguments and GlobalValues, because you've entered processInvariantIntrinsics() before evaluating the function. Good news, that's not wrong, it just doesn't have the values.

The alternative of running it afterwards would fill in the values, but be actively wrong in a case where you're inside a loop; a loop value incrementing %x from 0 to 5 would show as '5' even if it were used inside the loop.

================
Comment at: lib/Transforms/InstCombine/InstructionCombining.cpp:1966
@@ +1965,3 @@
+              dyn_cast<IntrinsicInst>(cast<Instruction>(*I->user_begin()));
+          assert(I->hasOneUse() && User &&
+                 User->getIntrinsicID() == Intrinsic::invariant_end &&
----------------
I don't think that's right. Proposed counterexample:

  declare {}* @llvm.invariant.start(i64 %size, i8* nocapture %ptr)
  declare void @llvm.invariant.end({}* %start, i64 %size, i8* nocapture %ptr)
  declare i8* @malloc(i32)
  define void @foo() {
    %p = call i8* @malloc(i32 1)
    %inv = call {}* @llvm.invariant.start(i64 -1, i8* %p)
    %dead = icmp eq i8* %p, i8* null
    call void @llvm.invariant.end({}* %inv, i64 -1, i8* %p)
    ret void
  }

If "isAllocSiteRemovable" is true, we should just delete invariant starts and ends. There may be a pile of them, like:

    %inv1 = call {}* @llvm.invariant.start(i64 -1, i8* %p)
    call void @llvm.invariant.end({}* %inv1, i64 -1, i8* %p)
    %inv2 = call {}* @llvm.invariant.start(i64 -1, i8* %p)
    call void @llvm.invariant.end({}* %inv2, i64 -1, i8* %p)
    %inv3 = call {}* @llvm.invariant.start(i64 -1, i8* %p)
    call void @llvm.invariant.end({}* %inv3, i64 -1, i8* %p)

and we don't care since we already checked that it's safe to delete them all. Just nuke away!

Also, don't dyn_cast + assert the result, just use cast<>. It has the assert inside.


http://reviews.llvm.org/D13606





More information about the llvm-commits mailing list