[PATCH] D41860: [CallSiteSplitting] Support splitting of blocks with instrs before call.

Jun Bum Lim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 1 10:54:12 PST 2018


junbuml added inline comments.


================
Comment at: lib/Transforms/Scalar/CallSiteSplitting.cpp:192-193
   BasicBlock *CallSiteBB = Instr->getParent();
-  if (Instr != CallSiteBB->getFirstNonPHIOrDbg())
+  if (std::distance(CallSiteBB->getFirstNonPHIOrDbg()->getIterator(),
+                    Instr->getIterator()) >= DuplicationThreshold)
     return false;
----------------
Here we assume that the duplication costs of instructions are all same. However, I believe this assumption is not always true because each instruction here will end up differently after lowering. For example, if there is another call before the call we are trying to split, the cost for such call should be more expensive than other simple instructions (e.g., add or trunc ).  If such call duplicated is inlined, the code size could be significantly impacted. I think we should consider using getUserCost(), instead of number of instructions ? 


================
Comment at: lib/Transforms/Scalar/CallSiteSplitting.cpp:313
+        NewPN->addIncoming(Mapping[CurrentI],
+													 cast<Instruction>(Mapping[CurrentI])->getParent());
+      NewPN->insertBefore(&*TailBB->begin());
----------------
If CurrnetI is PHI, "Mapping[CurrentI])->getParent()" may not point the new SplitBlock becasue DuplicateInstructionsInSplitBetween map a PHI with its original incoming value. Isn't it okay to skip doing this if CurrnetI is PHI? 

Based on your test_remove_unused_phi(), we should add a test case something like this : 

```
define void @test_no_remove_used_phi(i32* %ptrarg, i32* %ptrarg2, i32 %i, i32 %i2) {
Header:
  %l1 = load i32, i32* undef, align 16
  %tobool = icmp ne i32* %ptrarg, null
  br i1 %tobool, label %TBB, label %CallSite

TBB:                                    ; preds = %Header
  %arrayidx = getelementptr inbounds i32, i32* %ptrarg, i64 42
  %0 = load i32, i32* %arrayidx, align 4
  %l2 = load i32, i32* undef, align 16
  %tobool1 = icmp ne i32 %0, 0
  br i1 %tobool1, label %CallSite, label %End

CallSite:                                          ; preds = %TBB, %Header
  %l = phi i32 [ %l1, %Header ], [ %l2, %TBB ]
  call void @bar(i32* %ptrarg, i32 %l)
  call void @bari(i32 %l)                   <------------ %l is used here, so it will hit the case I mentioned.
  br label %End

End:                                           ; preds = %CallSite, %TBB
  ret void
}

```


https://reviews.llvm.org/D41860





More information about the llvm-commits mailing list