<div dir="ltr">These leaks were also caused by the patch <a href="http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4951/steps/check-llvm%20asan/logs/stdio">http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4951/steps/check-llvm%20asan/logs/stdio</a></div><br><div class="gmail_quote"><div dir="ltr">On Sat, May 13, 2017 at 8:07 PM Xinliang David Li via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">thanks for the report. I committed a fix.</div><div dir="ltr"><div><br></div><div>David</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, May 13, 2017 at 6:22 PM, Zachary Turner <span dir="ltr"><<a href="mailto:zturner@google.com" target="_blank">zturner@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">This is causing crashes on my Windows machine.<div><br></div><div><div>FAIL: LLVM :: Transforms/Inline/partial-inline-act.ll (1 of 1)</div><div>******************** TEST 'LLVM :: Transforms/Inline/partial-inline-act.ll' FAILED ********************</div><div>Script:</div><div>--</div><div>C:/src/llvmbuild/ninja/./bin\opt.EXE < C:\src\llvm-mono\llvm\test\Transforms\Inline\partial-inline-act.ll -partial-inliner -disable-output</div><div>--</div><div>Exit Code: -1073741819</div><div><br></div><div>Command Output (stdout):</div><div>--</div><div>$ "C:/src/llvmbuild/ninja/./bin\opt.EXE" "-partial-inliner" "-disable-output"</div><div># command stderr:</div><div>#0 0x020522b1 llvm::PointerIntPair<class llvm::Use * *,2,enum llvm::Use::PrevPtrTag,struct llvm::Use::PrevPointerTraits,struct llvm::PointerIntPairInfo<class llvm::Use * *,2,struct llvm::Use::PrevPointerTraits> >::getInt(void)const 
c:\src\llvm-mono\llvm\include\llvm\adt\pointerintpair.h:59:0</div><div>#1 0x02051fc3 llvm::Use::getImpliedUser(void)const c:\src\llvm-mono\llvm\lib\ir\use.cpp:98:0</div><div>#2 0x02051d42 llvm::Use::getUser(void)const c:\src\llvm-mono\llvm\lib\ir\use.cpp:42:0</div><div>#3 0x011ac33d llvm::Value::user_iterator_impl<class llvm::User>::operator*(void)const c:\src\llvm-mono\llvm\include\llvm\ir\value.h:189:0</div><div>#4 0x025230a4 `anonymous namespace'::PartialInlinerImpl::run c:\src\llvm-mono\llvm\lib\transforms\ipo\partialinlining.cpp:829:0</div><div>#5 0x0132d19f `anonymous namespace'::PartialInlinerLegacyPass::runOnModule c:\src\llvm-mono\llvm\lib\transforms\ipo\partialinlining.cpp:220:0</div><div>#6 0x0209149e `anonymous namespace'::MPPassManager::runOnModule c:\src\llvm-mono\llvm\lib\ir\legacypassmanager.cpp:1590:0</div><div>#7 0x02091b7a llvm::legacy::PassManagerImpl::run(class llvm::Module &) c:\src\llvm-mono\llvm\lib\ir\legacypassmanager.cpp:1693:0</div><div>#8 0x0208c1dd llvm::legacy::PassManager::run(class llvm::Module &) c:\src\llvm-mono\llvm\lib\ir\legacypassmanager.cpp:1725:0</div><div>#9 0x011ec251 main c:\src\llvm-mono\llvm\tools\opt\opt.cpp:742:0</div><div>#10 0x0348ddde invoke_main f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:64:0</div><div>#11 0x0348dc60 _scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253:0</div><div>#12 0x0348dafd _scrt_common_main f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:296:0</div><div>#13 0x0348ddf8 mainCRTStartup f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp:17:0</div><div>#14 0x77118744 (C:\WINDOWS\System32\KERNEL32.DLL+0x18744)</div><div>#15 0x774d587d (C:\WINDOWS\SYSTEM32\ntdll.dll+0x6587d)</div><div>#16 0x774d584d (C:\WINDOWS\SYSTEM32\ntdll.dll+0x6584d)</div><div><br></div><div>error: command failed with exit status: 0xc0000005</div></div><div><br></div><div><br></div><div>Another test is crashing with a similar issue. 
Any thoughts on how to fix this?</div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, May 12, 2017 at 4:54 PM Xinliang David Li via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: davidxl<br>
Date: Fri May 12 18:41:43 2017<br>
New Revision: 302967<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=302967&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=302967&view=rev</a><br>
Log:<br>
[PartialInlining] Profile based cost analysis<br>
<br>
Implemented frequency based cost/saving analysis<br>
and related options.<br>
<br>
The pass is now in a state ready to be turned on<br>
in the pipeline (in follow up).<br>
<br>
Differential Revision: <a href="http://reviews.llvm.org/D32783" rel="noreferrer" target="_blank">http://reviews.llvm.org/D32783</a><br>
<br>
Added:<br>
llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll<br>
Modified:<br>
llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp<br>
llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll<br>
llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll<br>
<br>
Modified: llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp Fri May 12 18:41:43 2017<br>
@@ -16,6 +16,7 @@<br>
#include "llvm/ADT/Statistic.h"<br>
#include "llvm/Analysis/BlockFrequencyInfo.h"<br>
#include "llvm/Analysis/BranchProbabilityInfo.h"<br>
+#include "llvm/Analysis/CodeMetrics.h"<br>
#include "llvm/Analysis/InlineCost.h"<br>
#include "llvm/Analysis/LoopInfo.h"<br>
#include "llvm/Analysis/OptimizationDiagnosticInfo.h"<br>
@@ -42,6 +43,11 @@ STATISTIC(NumPartialInlined,<br>
static cl::opt<bool><br>
DisablePartialInlining("disable-partial-inlining", cl::init(false),<br>
cl::Hidden, cl::desc("Disable partial ininling"));<br>
+// This is an option used by testing:<br>
+static cl::opt<bool> SkipCostAnalysis("skip-partial-inlining-cost-analysis",<br>
+ cl::init(false), cl::ZeroOrMore,<br>
+ cl::ReallyHidden,<br>
+ cl::desc("Skip Cost Analysis"));<br>
<br>
static cl::opt<unsigned> MaxNumInlineBlocks(<br>
"max-num-inline-blocks", cl::init(5), cl::Hidden,<br>
@@ -53,6 +59,15 @@ static cl::opt<int> MaxNumPartialInlinin<br>
"max-partial-inlining", cl::init(-1), cl::Hidden, cl::ZeroOrMore,<br>
cl::desc("Max number of partial inlining. The default is unlimited"));<br>
<br>
+// Used only when PGO or user annotated branch data is absent. It is<br>
+// the least value that is used to weigh the outline region. If BFI<br>
+// produces larger value, the BFI value will be used.<br>
+static cl::opt<int><br>
+ OutlineRegionFreqPercent("outline-region-freq-percent", cl::init(75),<br>
+ cl::Hidden, cl::ZeroOrMore,<br>
+ cl::desc("Relative frequency of outline region to "<br>
+ "the entry block"));<br>
+<br>
namespace {<br>
<br>
struct FunctionOutliningInfo {<br>
@@ -84,8 +99,6 @@ struct PartialInlinerImpl {<br>
bool run(Module &M);<br>
Function *unswitchFunction(Function *F);<br>
<br>
- std::unique_ptr<FunctionOutliningInfo> computeOutliningInfo(Function *F);<br>
-<br>
private:<br>
int NumPartialInlining = 0;<br>
std::function<AssumptionCache &(Function &)> *GetAssumptionCache;<br>
@@ -93,11 +106,84 @@ private:<br>
Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI;<br>
ProfileSummaryInfo *PSI;<br>
<br>
- bool shouldPartialInline(CallSite CS, OptimizationRemarkEmitter &ORE);<br>
+ // Return the frequency of the OutlininingBB relative to F's entry point.<br>
+ // The result is no larger than 1 and is represented using BP.<br>
+ // (Note that the outlined region's 'head' block can only have incoming<br>
+ // edges from the guarding entry blocks).<br>
+ BranchProbability getOutliningCallBBRelativeFreq(Function *F,<br>
+ FunctionOutliningInfo *OI,<br>
+ Function *DuplicateFunction,<br>
+ BlockFrequencyInfo *BFI,<br>
+ BasicBlock *OutliningCallBB);<br>
+<br>
+ // Return true if the callee of CS should be partially inlined with<br>
+ // profit.<br>
+ bool shouldPartialInline(CallSite CS, Function *F, FunctionOutliningInfo *OI,<br>
+ BlockFrequencyInfo *CalleeBFI,<br>
+ BasicBlock *OutliningCallBB,<br>
+ int OutliningCallOverhead,<br>
+ OptimizationRemarkEmitter &ORE);<br>
+<br>
+ // Try to inline DuplicateFunction (cloned from F with a call to<br>
+ // the OutlinedFunction) into its callers. Return true<br>
+ // if there is any successful inlining.<br>
+ bool tryPartialInline(Function *DuplicateFunction,<br>
+ Function *F, /* original function */<br>
+ FunctionOutliningInfo *OI, Function *OutlinedFunction,<br>
+ BlockFrequencyInfo *CalleeBFI);<br>
+<br>
+ // Compute the mapping from use site of DuplicationFunction to the enclosing<br>
+ // BB's profile count.<br>
+ void computeCallsiteToProfCountMap(Function *DuplicateFunction,<br>
+ DenseMap<User *, uint64_t> &SiteCountMap);<br>
+<br>
bool IsLimitReached() {<br>
return (MaxNumPartialInlining != -1 &&<br>
NumPartialInlining >= MaxNumPartialInlining);<br>
}<br>
+<br>
+ CallSite getCallSite(User *U) {<br>
+ CallSite CS;<br>
+ if (CallInst *CI = dyn_cast<CallInst>(U))<br>
+ CS = CallSite(CI);<br>
+ else if (InvokeInst *II = dyn_cast<InvokeInst>(U))<br>
+ CS = CallSite(II);<br>
+ else<br>
+ llvm_unreachable("All uses must be calls");<br>
+ return CS;<br>
+ }<br>
+<br>
+ CallSite getOneCallSiteTo(Function *F) {<br>
+ User *User = *F->user_begin();<br>
+ return getCallSite(User);<br>
+ }<br>
+<br>
+ std::tuple<DebugLoc, BasicBlock *> getOneDebugLoc(Function *F) {<br>
+ CallSite CS = getOneCallSiteTo(F);<br>
+ DebugLoc DLoc = CS.getInstruction()->getDebugLoc();<br>
+ BasicBlock *Block = CS.getParent();<br>
+ return std::make_tuple(DLoc, Block);<br>
+ }<br>
+<br>
+ // Returns the costs associated with function outlining:<br>
+ // - The first value is the non-weighted runtime cost for making the call<br>
+ // to the outlined function 'OutlinedFunction', including the additional<br>
+ // setup cost in the outlined function itself;<br>
+ // - The second value is the estimated size of the new call sequence in<br>
+ // basic block 'OutliningCallBB';<br>
+ // - The third value is the estimated size of the original code from<br>
+ // function 'F' that is extracted into the outlined function.<br>
+ std::tuple<int, int, int><br>
+ computeOutliningCosts(Function *F, const FunctionOutliningInfo *OutliningInfo,<br>
+ Function *OutlinedFunction,<br>
+ BasicBlock *OutliningCallBB);<br>
+ // Compute the 'InlineCost' of block BB. InlineCost is a proxy used to<br>
+ // approximate both the size and runtime cost (Note that in the current<br>
+ // inline cost analysis, there is no clear distinction there either).<br>
+ int computeBBInlineCost(BasicBlock *BB);<br>
+<br>
+ std::unique_ptr<FunctionOutliningInfo> computeOutliningInfo(Function *F);<br>
+<br>
};<br>
<br>
struct PartialInlinerLegacyPass : public ModulePass {<br>
@@ -223,7 +309,8 @@ PartialInlinerImpl::computeOutliningInfo<br>
// Do sanity check of the entries: there should not<br>
// be any successors (not in the entry set) other than<br>
// {ReturnBlock, NonReturnBlock}<br>
- assert(OutliningInfo->Entries[0] == &F->front());<br>
+ assert(OutliningInfo->Entries[0] == &F->front() &&<br>
+ "Function Entry must be the first in Entries vector");<br>
DenseSet<BasicBlock *> Entries;<br>
for (BasicBlock *E : OutliningInfo->Entries)<br>
Entries.insert(E);<br>
@@ -289,10 +376,54 @@ PartialInlinerImpl::computeOutliningInfo<br>
return OutliningInfo;<br>
}<br>
<br>
-bool PartialInlinerImpl::shouldPartialInline(CallSite CS,<br>
- OptimizationRemarkEmitter &ORE) {<br>
- // TODO : more sharing with shouldInline in Inliner.cpp<br>
+// Check if there is PGO data or user annotated branch data:<br>
+static bool hasProfileData(Function *F, FunctionOutliningInfo *OI) {<br>
+ if (F->getEntryCount())<br>
+ return true;<br>
+ // Now check if any of the entry blocks has MD_prof data:<br>
+ for (auto *E : OI->Entries) {<br>
+ BranchInst *BR = dyn_cast<BranchInst>(E->getTerminator());<br>
+ if (!BR || BR->isUnconditional())<br>
+ continue;<br>
+ uint64_t T, F;<br>
+ if (BR->extractProfMetadata(T, F))<br>
+ return true;<br>
+ }<br>
+ return false;<br>
+}<br>
+<br>
+BranchProbability PartialInlinerImpl::getOutliningCallBBRelativeFreq(<br>
+ Function *F, FunctionOutliningInfo *OI, Function *DuplicateFunction,<br>
+ BlockFrequencyInfo *BFI, BasicBlock *OutliningCallBB) {<br>
+<br>
+ auto EntryFreq =<br>
+ BFI->getBlockFreq(&DuplicateFunction->getEntryBlock());<br>
+ auto OutliningCallFreq = BFI->getBlockFreq(OutliningCallBB);<br>
+<br>
+ auto OutlineRegionRelFreq =<br>
+ BranchProbability::getBranchProbability(OutliningCallFreq.getFrequency(),<br>
+ EntryFreq.getFrequency());<br>
+<br>
+ if (hasProfileData(F, OI))<br>
+ return OutlineRegionRelFreq;<br>
+<br>
+ // When profile data is not available, we need to be very<br>
+ // conservative in estimating the overall savings. We need to make sure<br>
+ // the outline region relative frequency is not below the threshold<br>
+ // specified by the option.<br>
+ OutlineRegionRelFreq = std::max(OutlineRegionRelFreq, BranchProbability(OutlineRegionFreqPercent, 100));<br>
+<br>
+ return OutlineRegionRelFreq;<br>
+}<br>
+<br>
+bool PartialInlinerImpl::shouldPartialInline(<br>
+ CallSite CS, Function *F /* Original Callee */, FunctionOutliningInfo *OI,<br>
+ BlockFrequencyInfo *CalleeBFI, BasicBlock *OutliningCallBB,<br>
+ int NonWeightedOutliningRcost, OptimizationRemarkEmitter &ORE) {<br>
using namespace ore;<br>
+ if (SkipCostAnalysis)<br>
+ return true;<br>
+<br>
Instruction *Call = CS.getInstruction();<br>
Function *Callee = CS.getCalledFunction();<br>
Function *Caller = CS.getCaller();<br>
@@ -302,36 +433,166 @@ bool PartialInlinerImpl::shouldPartialIn<br>
<br>
if (IC.isAlways()) {<br>
ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "AlwaysInline", Call)<br>
- << NV("Callee", Callee)<br>
+ << NV("Callee", F)<br>
<< " should always be fully inlined, not partially");<br>
return false;<br>
}<br>
<br>
if (IC.isNever()) {<br>
ORE.emit(OptimizationRemarkMissed(DEBUG_TYPE, "NeverInline", Call)<br>
- << NV("Callee", Callee) << " not partially inlined into "<br>
+ << NV("Callee", F) << " not partially inlined into "<br>
<< NV("Caller", Caller)<br>
<< " because it should never be inlined (cost=never)");<br>
return false;<br>
}<br>
<br>
if (!IC) {<br>
- ORE.emit(OptimizationRemarkMissed(DEBUG_TYPE, "TooCostly", Call)<br>
- << NV("Callee", Callee) << " not partially inlined into "<br>
+ ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "TooCostly", Call)<br>
+ << NV("Callee", F) << " not partially inlined into "<br>
<< NV("Caller", Caller) << " because too costly to inline (cost="<br>
<< NV("Cost", IC.getCost()) << ", threshold="<br>
<< NV("Threshold", IC.getCostDelta() + IC.getCost()) << ")");<br>
return false;<br>
}<br>
+ const DataLayout &DL = Caller->getParent()->getDataLayout();<br>
+ // The savings of eliminating the call:<br>
+ int NonWeightedSavings = getCallsiteCost(CS, DL);<br>
+ BlockFrequency NormWeightedSavings(NonWeightedSavings);<br>
+<br>
+ auto RelativeFreq =<br>
+ getOutliningCallBBRelativeFreq(F, OI, Callee, CalleeBFI, OutliningCallBB);<br>
+ auto NormWeightedRcost =<br>
+ BlockFrequency(NonWeightedOutliningRcost) * RelativeFreq;<br>
+<br>
+ // Weighted saving is smaller than weighted cost, return false<br>
+ if (NormWeightedSavings < NormWeightedRcost) {<br>
+ ORE.emit(<br>
+ OptimizationRemarkAnalysis(DEBUG_TYPE, "OutliningCallcostTooHigh", Call)<br>
+ << NV("Callee", F) << " not partially inlined into "<br>
+ << NV("Caller", Caller) << " runtime overhead (overhead="<br>
+ << NV("Overhead", (unsigned)NormWeightedRcost.getFrequency())<br>
+ << ", savings="<br>
+ << NV("Savings", (unsigned)NormWeightedSavings.getFrequency()) << ")"<br>
+ << " of making the outlined call is too high");<br>
+<br>
+ return false;<br>
+ }<br>
<br>
ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "CanBePartiallyInlined", Call)<br>
- << NV("Callee", Callee) << " can be partially inlined into "<br>
+ << NV("Callee", F) << " can be partially inlined into "<br>
<< NV("Caller", Caller) << " with cost=" << NV("Cost", IC.getCost())<br>
<< " (threshold="<br>
<< NV("Threshold", IC.getCostDelta() + IC.getCost()) << ")");<br>
return true;<br>
}<br>
<br>
+// TODO: Ideally we should share Inliner's InlineCost Analysis code.<br>
+// For now use a simplified version. The returned 'InlineCost' will be used<br>
+// to estimate the size cost as well as runtime cost of the BB.<br>
+int PartialInlinerImpl::computeBBInlineCost(BasicBlock *BB) {<br>
+ int InlineCost = 0;<br>
+ const DataLayout &DL = BB->getParent()->getParent()->getDataLayout();<br>
+ for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) {<br>
+ if (isa<DbgInfoIntrinsic>(I))<br>
+ continue;<br>
+<br>
+ if (CallInst *CI = dyn_cast<CallInst>(I)) {<br>
+ InlineCost += getCallsiteCost(CallSite(CI), DL);<br>
+ continue;<br>
+ }<br>
+<br>
+ if (InvokeInst *II = dyn_cast<InvokeInst>(I)) {<br>
+ InlineCost += getCallsiteCost(CallSite(II), DL);<br>
+ continue;<br>
+ }<br>
+<br>
+ if (SwitchInst *SI = dyn_cast<SwitchInst>(I)) {<br>
+ InlineCost += (SI->getNumCases() + 1) * InlineConstants::InstrCost;<br>
+ continue;<br>
+ }<br>
+ InlineCost += InlineConstants::InstrCost;<br>
+ }<br>
+ return InlineCost;<br>
+}<br>
+<br>
+std::tuple<int, int, int> PartialInlinerImpl::computeOutliningCosts(<br>
+ Function *F, const FunctionOutliningInfo *OI, Function *OutlinedFunction,<br>
+ BasicBlock *OutliningCallBB) {<br>
+ // First compute the cost of the outlined region 'OI' in the original<br>
+ // function 'F':<br>
+ int OutlinedRegionCost = 0;<br>
+ for (BasicBlock &BB : *F) {<br>
+ if (&BB != OI->ReturnBlock &&<br>
+ // Assuming Entry set is small -- do a linear search here:<br>
+ std::find(OI->Entries.begin(), OI->Entries.end(), &BB) ==<br>
+ OI->Entries.end()) {<br>
+ OutlinedRegionCost += computeBBInlineCost(&BB);<br>
+ }<br>
+ }<br>
+<br>
+ // Now compute the cost of the call sequence to the outlined function<br>
+ // 'OutlinedFunction' in BB 'OutliningCallBB':<br>
+ int OutliningFuncCallCost = computeBBInlineCost(OutliningCallBB);<br>
+<br>
+ // Now compute the cost of the extracted/outlined function itself:<br>
+ int OutlinedFunctionCost = 0;<br>
+ for (BasicBlock &BB : *OutlinedFunction) {<br>
+ OutlinedFunctionCost += computeBBInlineCost(&BB);<br>
+ }<br>
+<br>
+ assert(OutlinedFunctionCost >= OutlinedRegionCost &&<br>
+ "Outlined function cost should be no less than the outlined region");<br>
+ int OutliningRuntimeOverhead =<br>
+ OutliningFuncCallCost + (OutlinedFunctionCost - OutlinedRegionCost);<br>
+<br>
+ return std::make_tuple(OutliningFuncCallCost, OutliningRuntimeOverhead,<br>
+ OutlinedRegionCost);<br>
+}<br>
+<br>
+// Create the callsite to profile count map which is<br>
+// used to update the original function's entry count,<br>
+// after the function is partially inlined into the callsite.<br>
+void PartialInlinerImpl::computeCallsiteToProfCountMap(<br>
+ Function *DuplicateFunction,<br>
+ DenseMap<User *, uint64_t> &CallSiteToProfCountMap) {<br>
+ std::vector<User *> Users(DuplicateFunction->user_begin(),<br>
+ DuplicateFunction->user_end());<br>
+ Function *CurrentCaller = nullptr;<br>
+ BlockFrequencyInfo *CurrentCallerBFI = nullptr;<br>
+<br>
+ auto ComputeCurrBFI = [&,this](Function *Caller) {<br>
+ // For the old pass manager:<br>
+ if (!GetBFI) {<br>
+ if (CurrentCallerBFI)<br>
+ delete CurrentCallerBFI;<br>
+ DominatorTree DT(*Caller);<br>
+ LoopInfo LI(DT);<br>
+ BranchProbabilityInfo BPI(*Caller, LI);<br>
+ CurrentCallerBFI = new BlockFrequencyInfo(*Caller, BPI, LI);<br>
+ } else {<br>
+ // New pass manager:<br>
+ CurrentCallerBFI = &(*GetBFI)(*Caller);<br>
+ }<br>
+ };<br>
+<br>
+ for (User *User : Users) {<br>
+ CallSite CS = getCallSite(User);<br>
+ Function *Caller = CS.getCaller();<br>
+ if (CurrentCaller != Caller) {<br>
+ CurrentCaller = Caller;<br>
+ ComputeCurrBFI(Caller);<br>
+ } else {<br>
+ assert(CurrentCallerBFI && "CallerBFI is not set");<br>
+ }<br>
+ BasicBlock *CallBB = CS.getInstruction()->getParent();<br>
+ auto Count = CurrentCallerBFI->getBlockProfileCount(CallBB);<br>
+ if (Count)<br>
+ CallSiteToProfCountMap[User] = *Count;<br>
+ else<br>
+ CallSiteToProfCountMap[User] = 0;<br>
+ }<br>
+}<br>
+<br>
Function *PartialInlinerImpl::unswitchFunction(Function *F) {<br>
<br>
if (F->hasAddressTaken())<br>
@@ -347,21 +608,21 @@ Function *PartialInlinerImpl::unswitchFu<br>
if (PSI->isFunctionEntryCold(F))<br>
return nullptr;<br>
<br>
- std::unique_ptr<FunctionOutliningInfo> OutliningInfo =<br>
- computeOutliningInfo(F);<br>
+ if (F->user_begin() == F->user_end())<br>
+ return nullptr;<br>
<br>
- if (!OutliningInfo)<br>
+ std::unique_ptr<FunctionOutliningInfo> OI = computeOutliningInfo(F);<br>
+<br>
+ if (!OI)<br>
return nullptr;<br>
<br>
// Clone the function, so that we can hack away on it.<br>
ValueToValueMapTy VMap;<br>
Function *DuplicateFunction = CloneFunction(F, VMap);<br>
- BasicBlock *NewReturnBlock =<br>
- cast<BasicBlock>(VMap[OutliningInfo->ReturnBlock]);<br>
- BasicBlock *NewNonReturnBlock =<br>
- cast<BasicBlock>(VMap[OutliningInfo->NonReturnBlock]);<br>
+ BasicBlock *NewReturnBlock = cast<BasicBlock>(VMap[OI->ReturnBlock]);<br>
+ BasicBlock *NewNonReturnBlock = cast<BasicBlock>(VMap[OI->NonReturnBlock]);<br>
DenseSet<BasicBlock *> NewEntries;<br>
- for (BasicBlock *BB : OutliningInfo->Entries) {<br>
+ for (BasicBlock *BB : OI->Entries) {<br>
NewEntries.insert(cast<BasicBlock>(VMap[BB]));<br>
}<br>
<br>
@@ -390,7 +651,7 @@ Function *PartialInlinerImpl::unswitchFu<br>
BasicBlock *PreReturn = NewReturnBlock;<br>
// only split block when necessary:<br>
PHINode *FirstPhi = getFirstPHI(PreReturn);<br>
- unsigned NumPredsFromEntries = OutliningInfo->ReturnBlockPreds.size();<br>
+ unsigned NumPredsFromEntries = OI->ReturnBlockPreds.size();<br>
if (FirstPhi && FirstPhi->getNumIncomingValues() > NumPredsFromEntries + 1) {<br>
<br>
NewReturnBlock = NewReturnBlock->splitBasicBlock(<br>
@@ -408,14 +669,14 @@ Function *PartialInlinerImpl::unswitchFu<br>
Ins = NewReturnBlock->getFirstNonPHI();<br>
<br>
RetPhi->addIncoming(&*I, PreReturn);<br>
- for (BasicBlock *E : OutliningInfo->ReturnBlockPreds) {<br>
+ for (BasicBlock *E : OI->ReturnBlockPreds) {<br>
BasicBlock *NewE = cast<BasicBlock>(VMap[E]);<br>
RetPhi->addIncoming(OldPhi->getIncomingValueForBlock(NewE), NewE);<br>
OldPhi->removeIncomingValue(NewE);<br>
}<br>
++I;<br>
}<br>
- for (auto E : OutliningInfo->ReturnBlockPreds) {<br>
+ for (auto E : OI->ReturnBlockPreds) {<br>
BasicBlock *NewE = cast<BasicBlock>(VMap[E]);<br>
NewE->getTerminator()->replaceUsesOfWith(PreReturn, NewReturnBlock);<br>
}<br>
@@ -443,50 +704,107 @@ Function *PartialInlinerImpl::unswitchFu<br>
BlockFrequencyInfo BFI(*DuplicateFunction, BPI, LI);<br>
<br>
// Extract the body of the if.<br>
- Function *ExtractedFunction =<br>
+ Function *OutlinedFunction =<br>
CodeExtractor(ToExtract, &DT, /*AggregateArgs*/ false, &BFI, &BPI)<br>
.extractCodeRegion();<br>
<br>
- // Inline the top-level if test into all callers.<br>
+ bool AnyInline =<br>
+ tryPartialInline(DuplicateFunction, F, OI.get(), OutlinedFunction, &BFI);<br>
+<br>
+ // Ditch the duplicate, since we're done with it, and rewrite all remaining<br>
+ // users (function pointers, etc.) back to the original function.<br>
+ DuplicateFunction->replaceAllUsesWith(F);<br>
+ DuplicateFunction->eraseFromParent();<br>
+ if (!AnyInline && OutlinedFunction)<br>
+ OutlinedFunction->eraseFromParent();<br>
+ return OutlinedFunction;<br>
+}<br>
+<br>
+bool PartialInlinerImpl::tryPartialInline(Function *DuplicateFunction,<br>
+ Function *F,<br>
+ FunctionOutliningInfo *OI,<br>
+ Function *OutlinedFunction,<br>
+ BlockFrequencyInfo *CalleeBFI) {<br>
+ if (OutlinedFunction == nullptr)<br>
+ return false;<br>
+<br>
+ int NonWeightedRcost;<br>
+ int SizeCost;<br>
+ int OutlinedRegionSizeCost;<br>
+<br>
+ auto OutliningCallBB =<br>
+ getOneCallSiteTo(OutlinedFunction).getInstruction()->getParent();<br>
+<br>
+ std::tie(SizeCost, NonWeightedRcost, OutlinedRegionSizeCost) =<br>
+ computeOutliningCosts(F, OI, OutlinedFunction, OutliningCallBB);<br>
+<br>
+ // If the call sequence to the outlined function is larger than the original<br>
+ // outlined region, outlining does not increase the chances of inlining<br>
+ // 'F' (the inliner uses the size increase to model the<br>
+ // cost of inlining a callee).<br>
+ if (!SkipCostAnalysis && OutlinedRegionSizeCost < SizeCost) {<br>
+ OptimizationRemarkEmitter ORE(F);<br>
+ DebugLoc DLoc;<br>
+ BasicBlock *Block;<br>
+ std::tie(DLoc, Block) = getOneDebugLoc(DuplicateFunction);<br>
+ ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "OutlineRegionTooSmall",<br>
+ DLoc, Block)<br>
+ << ore::NV("Function", F)<br>
+ << " not partially inlined into callers (Original Size = "<br>
+ << ore::NV("OutlinedRegionOriginalSize", OutlinedRegionSizeCost)<br>
+ << ", Size of call sequence to outlined function = "<br>
+ << ore::NV("NewSize", SizeCost) << ")");<br>
+ return false;<br>
+ }<br>
+<br>
+ assert(F->user_begin() == F->user_end() &&<br>
+ "F's users should all be replaced!");<br>
std::vector<User *> Users(DuplicateFunction->user_begin(),<br>
DuplicateFunction->user_end());<br>
<br>
+ DenseMap<User *, uint64_t> CallSiteToProfCountMap;<br>
+ if (F->getEntryCount())<br>
+ computeCallsiteToProfCountMap(DuplicateFunction, CallSiteToProfCountMap);<br>
+<br>
+ auto CalleeEntryCount = F->getEntryCount();<br>
+ uint64_t CalleeEntryCountV = (CalleeEntryCount ? *CalleeEntryCount : 0);<br>
+ bool AnyInline = false;<br>
for (User *User : Users) {<br>
- CallSite CS;<br>
- if (CallInst *CI = dyn_cast<CallInst>(User))<br>
- CS = CallSite(CI);<br>
- else if (InvokeInst *II = dyn_cast<InvokeInst>(User))<br>
- CS = CallSite(II);<br>
- else<br>
- llvm_unreachable("All uses must be calls");<br>
+ CallSite CS = getCallSite(User);<br>
<br>
if (IsLimitReached())<br>
continue;<br>
<br>
OptimizationRemarkEmitter ORE(CS.getCaller());<br>
- if (!shouldPartialInline(CS, ORE))<br>
+<br>
+ if (!shouldPartialInline(CS, F, OI, CalleeBFI, OutliningCallBB,<br>
+ NonWeightedRcost, ORE))<br>
continue;<br>
<br>
- DebugLoc DLoc = CS.getInstruction()->getDebugLoc();<br>
- BasicBlock *Block = CS.getParent();<br>
- ORE.emit(OptimizationRemark(DEBUG_TYPE, "PartiallyInlined", DLoc, Block)<br>
- << ore::NV("Callee", F) << " partially inlined into "<br>
- << ore::NV("Caller", CS.getCaller()));<br>
+ ORE.emit(<br>
+ OptimizationRemark(DEBUG_TYPE, "PartiallyInlined", CS.getInstruction())<br>
+ << ore::NV("Callee", F) << " partially inlined into "<br>
+ << ore::NV("Caller", CS.getCaller()));<br>
<br>
InlineFunctionInfo IFI(nullptr, GetAssumptionCache, PSI);<br>
InlineFunction(CS, IFI);<br>
+<br>
+ // Now update the entry count:<br>
+ if (CalleeEntryCountV && CallSiteToProfCountMap.count(User)) {<br>
+ uint64_t CallSiteCount = CallSiteToProfCountMap[User];<br>
+ CalleeEntryCountV -= std::min(CalleeEntryCountV, CallSiteCount);<br>
+ }<br>
+<br>
+ AnyInline = true;<br>
NumPartialInlining++;<br>
- // update stats<br>
+ // Update the stats<br>
NumPartialInlined++;<br>
}<br>
<br>
- // Ditch the duplicate, since we're done with it, and rewrite all remaining<br>
- // users (function pointers, etc.) back to the original function.<br>
- DuplicateFunction->replaceAllUsesWith(F);<br>
- DuplicateFunction->eraseFromParent();<br>
-<br>
+ if (AnyInline && CalleeEntryCount)<br>
+ F->setEntryCount(CalleeEntryCountV);<br>
<br>
- return ExtractedFunction;<br>
+ return AnyInline;<br>
}<br>
<br>
bool PartialInlinerImpl::run(Module &M) {<br>
<br>
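The profitability check in shouldPartialInline in the diff above compares the call-site savings against the outlining overhead scaled by the outlined region's relative frequency. A minimal standalone sketch of that arithmetic follows. It assumes plain integer costs and a percentage in place of BlockFrequency and BranchProbability, so rounding may differ from LLVM's. The function names are hypothetical and not part of the patch.<br>
<pre>
```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Relative frequency (in percent, 0..100) of the outlined region versus the
// function entry. Without profile data the estimate is clamped from below,
// mirroring the default of 75 for -outline-region-freq-percent in the patch.
inline uint64_t outlineRegionRelFreqPercent(uint64_t OutliningCallFreq,
                                            uint64_t EntryFreq,
                                            bool HasProfileData,
                                            uint64_t MinPercent = 75) {
  assert(EntryFreq > 0 && "entry frequency must be non-zero");
  // Simplification: cap at 100%, since the patch documents the result as
  // being no larger than 1.
  uint64_t Percent =
      std::min<uint64_t>(100, OutliningCallFreq * 100 / EntryFreq);
  if (HasProfileData)
    return Percent;
  return std::max(Percent, MinPercent);
}

// Returns true when the savings from eliminating the call are at least the
// frequency-weighted runtime cost of calling the outlined function.
inline bool isOutliningProfitable(uint64_t NonWeightedSavings,
                                  uint64_t NonWeightedRcost,
                                  uint64_t RelFreqPercent) {
  uint64_t WeightedRcost = NonWeightedRcost * RelFreqPercent / 100;
  return NonWeightedSavings >= WeightedRcost;
}
```
</pre>
With illustrative values of 25 for the savings and 40 for the overhead, the call site is rejected at a 75% relative frequency (weighted overhead 30) but accepted at 10% (weighted overhead 4). This is why the conservative 75% floor matters when no profile data is available.<br>
<br>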
Modified: llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/ExtractedFnEntryCount.ll Fri May 12 18:41:43 2017<br>
@@ -1,4 +1,4 @@<br>
-; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis -S | FileCheck %s<br>
<br>
; This test checks to make sure that the CodeExtractor<br>
; properly sets the entry count for the function that is<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/MultipleExitBranchProb.ll Fri May 12 18:41:43 2017<br>
@@ -1,4 +1,4 @@<br>
-; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -S | FileCheck %s<br>
+; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -skip-partial-inlining-cost-analysis -S | FileCheck %s<br>
<br>
; This test checks to make sure that CodeExtractor updates<br>
; the exit branch probabilities for multiple exit blocks.<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/PartialInlineAnd.ll Fri May 12 18:41:43 2017<br>
@@ -1,7 +1,7 @@<br>
; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s<br>
-; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
-; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
+; RUN: opt < %s -passes=partial-inliner -skip-partial-inlining-cost-analysis -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
<br>
; Function Attrs: nounwind uwtable<br>
define i32 @bar(i32 %arg) local_unnamed_addr #0 {<br>
<br>
Added: llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll?rev=302967&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll?rev=302967&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll (added)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/PartialInlineEntryUpdate.ll Fri May 12 18:41:43 2017<br>
@@ -0,0 +1,41 @@<br>
+; RUN: opt < %s -skip-partial-inlining-cost-analysis -partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -skip-partial-inlining-cost-analysis -passes=partial-inliner -S | FileCheck %s<br>
+<br>
+define i32 @Func(i1 %cond, i32* align 4 %align.val) !prof !1 {<br>
+; CHECK: @Func({{.*}}) !prof [[REMAINCOUNT:![0-9]+]]<br>
+entry:<br>
+ br i1 %cond, label %if.then, label %return<br>
+if.then:<br>
+ ; Dummy store to have more than 0 uses<br>
+ store i32 10, i32* %align.val, align 4<br>
+ br label %return<br>
+return: ; preds = %entry<br>
+ ret i32 0<br>
+}<br>
+<br>
+define internal i32 @Caller1(i1 %cond, i32* align 2 %align.val) !prof !3{<br>
+entry:<br>
+; CHECK-LABEL: @Caller1<br>
+; CHECK: br<br>
+; CHECK: call void @Func.1_<br>
+; CHECK: br<br>
+; CHECK: call void @Func.1_<br>
+ %val = call i32 @Func(i1 %cond, i32* %align.val)<br>
+ %val2 = call i32 @Func(i1 %cond, i32* %align.val)<br>
+ ret i32 %val<br>
+}<br>
+<br>
+define internal i32 @Caller2(i1 %cond, i32* align 2 %align.val) !prof !2{<br>
+entry:<br>
+; CHECK-LABEL: @Caller2<br>
+; CHECK: br<br>
+; CHECK: call void @Func.1_<br>
+ %val = call i32 @Func(i1 %cond, i32* %align.val)<br>
+ ret i32 %val<br>
+}<br>
+<br>
+; CHECK: [[REMAINCOUNT]] = !{!"function_entry_count", i64 150}<br>
+!1 = !{!"function_entry_count", i64 200}<br>
+!2 = !{!"function_entry_count", i64 10}<br>
+!3 = !{!"function_entry_count", i64 20}<br>
+<br>
<br>
Added: llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll?rev=302967&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll?rev=302967&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll (added)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/PartialInlineHighCost.ll Fri May 12 18:41:43 2017<br>
@@ -0,0 +1,107 @@<br>
+; The outlined region has high frequency and the outlining<br>
+; call sequence is expensive (input, output, multiple exit etc)<br>
+; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -S | FileCheck %s<br>
+; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=2 -S | FileCheck %s<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis -max-num-inline-blocks=2 -S | FileCheck --check-prefix=NOCOST %s<br>
+; RUN: opt < %s -passes=partial-inliner -skip-partial-inlining-cost-analysis -max-num-inline-blocks=2 -S | FileCheck --check-prefix=NOCOST %s<br>
+<br>
+<br>
+; Function Attrs: nounwind<br>
+define i32 @bar_hot_outline_region(i32 %arg) local_unnamed_addr #0 {<br>
+bb:<br>
+ %tmp = icmp slt i32 %arg, 0<br>
+ br i1 %tmp, label %bb1, label %bb16, !prof !1<br>
+<br>
+bb1: ; preds = %bb<br>
+ %tmp2 = tail call i32 (...) @foo() #0<br>
+ %tmp3 = tail call i32 (...) @foo() #0<br>
+ %tmp4 = tail call i32 (...) @foo() #0<br>
+ %tmp5 = tail call i32 (...) @foo() #0<br>
+ %tmp6 = tail call i32 (...) @foo() #0<br>
+ %tmp7 = tail call i32 (...) @foo() #0<br>
+ %tmp8 = add nsw i32 %arg, 1<br>
+ %tmp9 = tail call i32 @goo(i32 %tmp8) #0<br>
+ %tmp10 = tail call i32 (...) @foo() #0<br>
+ %tmp11 = icmp eq i32 %tmp10, 0<br>
+ br i1 %tmp11, label %bb12, label %bb16<br>
+<br>
+bb12: ; preds = %bb1<br>
+ %tmp13 = tail call i32 (...) @foo() #0<br>
+ %tmp14 = icmp eq i32 %tmp13, 0<br>
+ %tmp15 = select i1 %tmp14, i32 0, i32 3<br>
+ br label %bb16<br>
+<br>
+bb16: ; preds = %bb12, %bb1, %bb<br>
+ %tmp17 = phi i32 [ 2, %bb1 ], [ %tmp15, %bb12 ], [ 0, %bb ]<br>
+ ret i32 %tmp17<br>
+}<br>
+<br>
+define i32 @bar_cold_outline_region(i32 %arg) local_unnamed_addr #0 {<br>
+bb:<br>
+ %tmp = icmp slt i32 %arg, 0<br>
+ br i1 %tmp, label %bb1, label %bb16, !prof !2<br>
+<br>
+bb1: ; preds = %bb<br>
+ %tmp2 = tail call i32 (...) @foo() #0<br>
+ %tmp3 = tail call i32 (...) @foo() #0<br>
+ %tmp4 = tail call i32 (...) @foo() #0<br>
+ %tmp5 = tail call i32 (...) @foo() #0<br>
+ %tmp6 = tail call i32 (...) @foo() #0<br>
+ %tmp7 = tail call i32 (...) @foo() #0<br>
+ %tmp8 = add nsw i32 %arg, 1<br>
+ %tmp9 = tail call i32 @goo(i32 %tmp8) #0<br>
+ %tmp10 = tail call i32 (...) @foo() #0<br>
+ %tmp11 = icmp eq i32 %tmp10, 0<br>
+ br i1 %tmp11, label %bb12, label %bb16<br>
+<br>
+bb12: ; preds = %bb1<br>
+ %tmp13 = tail call i32 (...) @foo() #0<br>
+ %tmp14 = icmp eq i32 %tmp13, 0<br>
+ %tmp15 = select i1 %tmp14, i32 0, i32 3<br>
+ br label %bb16<br>
+<br>
+bb16: ; preds = %bb12, %bb1, %bb<br>
+ %tmp17 = phi i32 [ 2, %bb1 ], [ %tmp15, %bb12 ], [ 0, %bb ]<br>
+ ret i32 %tmp17<br>
+}<br>
+<br>
+; Function Attrs: nounwind<br>
+declare i32 @foo(...) local_unnamed_addr #0<br>
+<br>
+; Function Attrs: nounwind<br>
+declare i32 @goo(i32) local_unnamed_addr #0<br>
+<br>
+; Function Attrs: nounwind<br>
+define i32 @dummy_caller(i32 %arg) local_unnamed_addr #0 {<br>
+bb:<br>
+; CHECK-LABEL: @dummy_caller<br>
+; CHECK-NOT: br i1<br>
+; CHECK-NOT: call{{.*}}bar_hot_outline_region.<br>
+; NOCOST-LABEL: @dummy_caller<br>
+; NOCOST: br i1<br>
+; NOCOST: call{{.*}}bar_hot_outline_region.<br>
+<br>
+ %tmp = tail call i32 @bar_hot_outline_region(i32 %arg)<br>
+ ret i32 %tmp<br>
+}<br>
+<br>
+define i32 @dummy_caller2(i32 %arg) local_unnamed_addr #0 {<br>
+bb:<br>
+; CHECK-LABEL: @dummy_caller2<br>
+; CHECK: br i1<br>
+; CHECK: call{{.*}}bar_cold_outline_region.<br>
+; NOCOST-LABEL: @dummy_caller2<br>
+; NOCOST: br i1<br>
+; NOCOST: call{{.*}}bar_cold_outline_region.<br>
+<br>
+ %tmp = tail call i32 @bar_cold_outline_region(i32 %arg)<br>
+ ret i32 %tmp<br>
+}<br>
+<br>
+attributes #0 = { nounwind }<br>
+<br>
+!llvm.ident = !{!0}<br>
+<br>
+!0 = !{!"clang version 5.0.0 (trunk 301898)"}<br>
+!1 = !{!"branch_weights", i32 2000, i32 1}<br>
+!2 = !{!"branch_weights", i32 1, i32 100}<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOr.ll Fri May 12 18:41:43 2017<br>
@@ -1,5 +1,5 @@<br>
-; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
-; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis -S | FileCheck %s<br>
+; RUN: opt < %s -passes=partial-inliner -skip-partial-inlining-cost-analysis -S | FileCheck %s<br>
; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT %s<br>
<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/PartialInlineOrAnd.ll Fri May 12 18:41:43 2017<br>
@@ -1,7 +1,7 @@<br>
; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s<br>
-; RUN: opt < %s -partial-inliner -max-num-inline-blocks=3 -S | FileCheck --check-prefix=LIMIT3 %s<br>
-; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=3 -S | FileCheck --check-prefix=LIMIT3 %s<br>
+; RUN: opt < %s -partial-inliner -max-num-inline-blocks=3 -skip-partial-inlining-cost-analysis -S | FileCheck --check-prefix=LIMIT3 %s<br>
+; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=3 -skip-partial-inlining-cost-analysis -S | FileCheck --check-prefix=LIMIT3 %s<br>
; RUN: opt < %s -partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT2 %s<br>
; RUN: opt < %s -passes=partial-inliner -max-num-inline-blocks=2 -S | FileCheck --check-prefix=LIMIT2 %s<br>
<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/SingleCondition.ll Fri May 12 18:41:43 2017<br>
@@ -1,5 +1,5 @@<br>
-; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
-; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -skip-partial-inlining-cost-analysis -partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -skip-partial-inlining-cost-analysis -passes=partial-inliner -S | FileCheck %s<br>
<br>
define internal i32 @inlinedFunc(i1 %cond, i32* align 4 %align.val) {<br>
entry:<br>
<br>
Modified: llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll?rev=302967&r1=302966&r2=302967&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll?rev=302967&r1=302966&r2=302967&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll (original)<br>
+++ llvm/trunk/test/Transforms/CodeExtractor/X86/InheritTargetAttributes.ll Fri May 12 18:41:43 2017<br>
@@ -1,5 +1,5 @@<br>
-; RUN: opt < %s -partial-inliner | llc -filetype=null<br>
-; RUN: opt < %s -partial-inliner -S | FileCheck %s<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis | llc -filetype=null<br>
+; RUN: opt < %s -partial-inliner -skip-partial-inlining-cost-analysis -S | FileCheck %s<br>
; This testcase checks to see if CodeExtractor properly inherits<br>
; target specific attributes for the extracted function. This can<br>
; cause certain instructions that depend on the attributes to not<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
</blockquote></div>
</blockquote></div><br></div>
</blockquote></div>