<div dir="ltr">Hi Zaara, it looks like this change <a href="http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/12709">introduced a memory leak</a>. Could you please take a look?<div><span style="color:black;font-family:'Courier New',courier,monotype,monospace;font-size:medium"><br></span></div><div><span style="color:black;font-family:'Courier New',courier,monotype,monospace;font-size:medium">==59925==ERROR: LeakSanitizer: detected memory leaks</span></div><div><pre style="font-family:'Courier New',courier,monotype,monospace;color:rgb(0,0,0);font-size:medium;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><span class="gmail-stdout" style="font-family:'Courier New',courier,monotype,monospace;color:black">Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0xbe1518 in operator new(unsigned long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:92
    #1 0x41d988a in llvm::Pass* llvm::callDefaultCtor<llvm::DominatorTreeWrapperPass>() /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/PassSupport.h:77:63
    #2 0x42e61aa in createPass /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/PassInfo.h:102:12
    #3 0x42e61aa in llvm::PMDataManager::add(llvm::Pass*, bool) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1032
    #4 0x42dc5a8 in llvm::PMTopLevelManager::schedulePass(llvm::Pass*) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:700:6
    #5 0x45a6b1d in llvm::PassManagerBuilder::populateModulePassManager(llvm::legacy::PassManagerBase&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:542:9
    #6 0x5758356 in CreatePasses /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:660:13
    #7 0x5758356 in EmitAssembly /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:756
    #8 0x5758356 in clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1191
    #9 0x6fe0a22 in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292:7
    #10 0x7fac56b in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/Parse/ParseAST.cpp:159:13
    #11 0x6fdc7c3 in clang::CodeGenAction::ExecuteAction() /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:1031:28
    #12 0x64a9b9d in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:904:8
    #13 0x63a8fa4 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991:11
    #14 0x6694cdd in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252:25
    #15 0xbf46d5 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/tools/driver/cc1_main.cpp:221:13
    #16 0xbed060 in ExecuteCC1Tool /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/tools/driver/driver.cpp:309:12
    #17 0xbed060 in main /b/sanitizer-x86_64-linux-fast/build/llvm/tools/clang/tools/driver/driver.cpp:389
    #18 0x7f87441a02b0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202b0)</span></pre><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 17, 2018 at 10:22 AM, Zaara Syeda via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: syzaara<br>
Date: Wed Jan 17 10:22:55 2018<br>
New Revision: 322721<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=322721&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=322721&view=rev</a><br>
Log:<br>
[PowerPC] Add handling for ColdCC calling convention and a pass to mark<br>
candidates with coldcc attribute.<br>
<br>
This patch adds support for the coldcc calling convention for Power.<br>
This changes the set of non-volatile registers. It includes a pass to stress<br>
test the implementation by marking all static directly called functions with<br>
the coldcc attribute through the option -enable-coldcc-stress-test. It also<br>
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to<br>
functions which are cold at all call sites based on BlockFrequencyInfo when<br>
the containing function does not call any non cold functions.<br>
<br>
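Illustration (not part of the commit): a minimal sketch of the cold-call-site test that the log above describes, assuming plain integer frequencies in place of LLVM's BlockFrequencyInfo. The name isColdCallSiteSketch and the integer arithmetic are hypothetical; the default of 2 percent mirrors the coldcc-rel-freq option added in the GlobalOpt.cpp diff in this message.<br>
<pre>
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the check described in the log: a call site
// counts as cold when its block frequency is below RelFreqPercent% of the
// caller's entry-block frequency. Plain integers replace LLVM's
// BlockFrequency and BranchProbability types, so rounding may differ from
// the real pass, and very large frequencies could overflow here.
bool isColdCallSiteSketch(uint64_t CallSiteFreq, uint64_t CallerEntryFreq,
                          unsigned RelFreqPercent = 2) {
  assert(RelFreqPercent <= 100 && "threshold is a percentage");
  // Evaluate CallSiteFreq < CallerEntryFreq * RelFreqPercent / 100 by
  // cross-multiplying, so small frequencies are not truncated to zero.
  return CallSiteFreq * 100 < CallerEntryFreq * RelFreqPercent;
}
```
</pre>
With the default threshold, a call site with frequency 1 in a caller whose entry frequency is 100 counts as cold under this sketch; a call site with frequency 2 does not.<br>
<br>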
Differential Revision: <a href="https://reviews.llvm.org/D38413" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D38413</a><br>
<br>
Added:<br>
    llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc.ll<br>
    llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc2.ll<br>
    llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/<br>
    llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/coldcc_<wbr>coldsites.ll<br>
    llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/lit.local.<wbr>cfg<br>
    llvm/trunk/test/Transforms/<wbr>GlobalOpt/coldcc_stress_test.<wbr>ll<br>
Modified:<br>
    llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h<br>
    llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h<br>
    llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCCallingConv.td<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCFastISel.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCFrameLowering.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCISelLowering.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCRegisterInfo.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.cpp<br>
    llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.h<br>
    llvm/trunk/lib/Transforms/IPO/<wbr>GlobalOpt.cpp<br>
    llvm/trunk/test/Other/pass-<wbr>pipelines.ll<br>
<br>
Modified: llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Analysis/<wbr>TargetTransformInfo.h?rev=<wbr>322721&r1=322720&r2=322721&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h (original)<br>
+++ llvm/trunk/include/llvm/<wbr>Analysis/TargetTransformInfo.h Wed Jan 17 10:22:55 2018<br>
@@ -541,6 +541,10 @@ public:<br>
   /// containing this constant value for the target.<br>
   bool shouldBuildLookupTablesForCons<wbr>tant(Constant *C) const;<br>
<br>
+  /// \brief Return true if the input function, which is cold at all call<br>
+  /// sites, should use the coldcc calling convention.<br>
+  bool useColdCCForColdCall(Function &F) const;<br>
+<br>
   unsigned getScalarizationOverhead(Type *Ty, bool Insert, bool Extract) const;<br>
<br>
   unsigned getOperandsScalarizationOverhe<wbr>ad(ArrayRef<const Value *> Args,<br>
@@ -992,6 +996,7 @@ public:<br>
   virtual unsigned getJumpBufSize() = 0;<br>
   virtual bool shouldBuildLookupTables() = 0;<br>
   virtual bool shouldBuildLookupTablesForCons<wbr>tant(Constant *C) = 0;<br>
+  virtual bool useColdCCForColdCall(Function &F) = 0;<br>
   virtual unsigned<br>
   getScalarizationOverhead(Type *Ty, bool Insert, bool Extract) = 0;<br>
   virtual unsigned getOperandsScalarizationOverhe<wbr>ad(ArrayRef<const Value *> Args,<br>
@@ -1237,6 +1242,10 @@ public:<br>
   bool shouldBuildLookupTablesForCons<wbr>tant(Constant *C) override {<br>
     return Impl.<wbr>shouldBuildLookupTablesForCons<wbr>tant(C);<br>
   }<br>
+  bool useColdCCForColdCall(Function &F) override {<br>
+    return Impl.useColdCCForColdCall(F);<br>
+  }<br>
+<br>
   unsigned getScalarizationOverhead(Type *Ty, bool Insert,<br>
                                     bool Extract) override {<br>
     return Impl.getScalarizationOverhead(<wbr>Ty, Insert, Extract);<br>
<br>
Modified: llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Analysis/<wbr>TargetTransformInfoImpl.h?rev=<wbr>322721&r1=322720&r2=322721&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h (original)<br>
+++ llvm/trunk/include/llvm/<wbr>Analysis/<wbr>TargetTransformInfoImpl.h Wed Jan 17 10:22:55 2018<br>
@@ -284,6 +284,8 @@ public:<br>
   bool shouldBuildLookupTables() { return true; }<br>
   bool shouldBuildLookupTablesForCons<wbr>tant(Constant *C) { return true; }<br>
<br>
+  bool useColdCCForColdCall(Function &F) { return false; }<br>
+<br>
   unsigned getScalarizationOverhead(Type *Ty, bool Insert, bool Extract) {<br>
     return 0;<br>
   }<br>
<br>
Modified: llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Analysis/TargetTransformInfo.<wbr>cpp?rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp (original)<br>
+++ llvm/trunk/lib/Analysis/<wbr>TargetTransformInfo.cpp Wed Jan 17 10:22:55 2018<br>
@@ -226,6 +226,10 @@ bool TargetTransformInfo::<wbr>shouldBuildLoo<br>
   return TTIImpl-><wbr>shouldBuildLookupTablesForCons<wbr>tant(C);<br>
 }<br>
<br>
+bool TargetTransformInfo::<wbr>useColdCCForColdCall(Function &F) const {<br>
+  return TTIImpl->useColdCCForColdCall(<wbr>F);<br>
+}<br>
+<br>
 unsigned TargetTransformInfo::<br>
 getScalarizationOverhead(Type *Ty, bool Insert, bool Extract) const {<br>
   return TTIImpl-><wbr>getScalarizationOverhead(Ty, Insert, Extract);<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCCallingConv.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/PPCCallingConv.td?rev=<wbr>322721&r1=322720&r2=322721&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCCallingConv.td (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCCallingConv.td Wed Jan 17 10:22:55 2018<br>
@@ -45,6 +45,29 @@ def RetCC_PPC64_AnyReg : CallingConv<[<br>
   CCCustom<"CC_PPC_AnyReg_Error"<wbr>><br>
 ]>;<br>
<br>
+// Return-value convention for PowerPC coldcc.<br>
+def RetCC_PPC_Cold : CallingConv<[<br>
+  // Use the same return registers as RetCC_PPC, but limited to only<br>
+  // one return value. The remaining return values will be saved to<br>
+  // the stack.<br>
+  CCIfType<[i32, i1], CCIfSubtarget<"isPPC64()", CCPromoteToType<i64>>>,<br>
+  CCIfType<[i1], CCIfNotSubtarget<"isPPC64()", CCPromoteToType<i32>>>,<br>
+<br>
+  CCIfType<[i32], CCAssignToReg<[R3]>>,<br>
+  CCIfType<[i64], CCAssignToReg<[X3]>>,<br>
+  CCIfType<[i128], CCAssignToReg<[X3]>>,<br>
+<br>
+  CCIfType<[f32], CCAssignToReg<[F1]>>,<br>
+  CCIfType<[f64], CCAssignToReg<[F1]>>,<br>
+<br>
+  CCIfType<[v4f64, v4f32, v4i1],<br>
+           CCIfSubtarget<"hasQPX()", CCAssignToReg<[QF1]>>>,<br>
+<br>
+  CCIfType<[v16i8, v8i16, v4i32, v2i64, v1i128, v4f32, v2f64],<br>
+           CCIfSubtarget<"hasAltivec()",<br>
+           CCAssignToReg<[V2]>>><br>
+]>;<br>
+<br>
 // Return-value convention for PowerPC<br>
 def RetCC_PPC : CallingConv<[<br>
   CCIfCC<"CallingConv::AnyReg", CCDelegateTo<RetCC_PPC64_<wbr>AnyReg>>,<br>
@@ -271,6 +294,36 @@ def CSR_SVR464_R2_Altivec_ViaCopy : Call<br>
<br>
 def CSR_NoRegs : CalleeSavedRegs<(add)>;<br>
<br>
+// coldcc calling convention marks most registers as non-volatile.<br>
+// Do not include r1 since the stack pointer is never considered a CSR.<br>
+// Do not include r2, since it is the TOC register and is added depending<br>
+// on whether or not the function uses the TOC and is a non-leaf.<br>
+// Do not include r0,r11,r13 as they are optional in functional linkage<br>
+// and value may be altered by inter-library calls.<br>
+// Do not include r12 as it is used as a scratch register.<br>
+// Do not include return registers r3, f1, v2.<br>
+def CSR_SVR32_ColdCC : CalleeSavedRegs<(add (sequence "R%u", 4, 10),<br>
+                                          (sequence "R%u", 14, 31),<br>
+                                          F0, (sequence "F%u", 2, 31),<br>
+                                          (sequence "CR%u", 0, 7))>;<br>
+<br>
+def CSR_SVR32_ColdCC_Altivec : CalleeSavedRegs<(add CSR_SVR32_ColdCC,<br>
+                                            (sequence "V%u", 0, 1),<br>
+                                            (sequence "V%u", 3, 31))>;<br>
+<br>
+def CSR_SVR64_ColdCC : CalleeSavedRegs<(add  (sequence "X%u", 4, 10),<br>
+                                             (sequence "X%u", 14, 31),<br>
+                                             F0, (sequence "F%u", 2, 31),<br>
+                                             (sequence "CR%u", 0, 7))>;<br>
+<br>
+def CSR_SVR64_ColdCC_R2: CalleeSavedRegs<(add CSR_SVR64_ColdCC, X2)>;<br>
+<br>
+def CSR_SVR64_ColdCC_Altivec : CalleeSavedRegs<(add CSR_SVR64_ColdCC,<br>
+                                             (sequence "V%u", 0, 1),<br>
+                                             (sequence "V%u", 3, 31))>;<br>
+<br>
+def CSR_SVR64_ColdCC_R2_Altivec : CalleeSavedRegs<(add CSR_SVR64_ColdCC_Altivec, X2)>;<br>
+<br>
 def CSR_64_AllRegs: CalleeSavedRegs<(add X0, (sequence "X%u", 3, 10),<br>
                                              (sequence "X%u", 14, 31),<br>
                                              (sequence "F%u", 0, 31),<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCFastISel.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFastISel.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/PPCFastISel.cpp?rev=<wbr>322721&r1=322720&r2=322721&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCFastISel.cpp (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCFastISel.cpp Wed Jan 17 10:22:55 2018<br>
@@ -206,6 +206,8 @@ CCAssignFn *PPCFastISel::usePPC32CCs(uns<br>
     return CC_PPC32_SVR4_ByVal;<br>
   else if (Flag == 3)<br>
     return CC_PPC32_SVR4_VarArg;<br>
+  else if (Flag == 4)<br>
+    return RetCC_PPC_Cold;<br>
   else<br>
     return RetCC_PPC;<br>
 }<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCFrameLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/PPCFrameLowering.cpp?<wbr>rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCFrameLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCFrameLowering.cpp Wed Jan 17 10:22:55 2018<br>
@@ -1950,7 +1950,14 @@ PPCFrameLowering::<wbr>spillCalleeSavedRegist<br>
     bool IsCRField = PPC::CR2 <= Reg && Reg <= PPC::CR4;<br>
<br>
     // Add the callee-saved register as live-in; it's killed at the spill.<br>
-    MBB.addLiveIn(Reg);<br>
+    // Do not do this for callee-saved registers that are live-in to the<br>
+    // function because they will already be marked live-in and this will be<br>
+    // adding it for a second time. It is an error to add the same register<br>
+    // to the set more than once.<br>
+    const MachineRegisterInfo &MRI = MF->getRegInfo();<br>
+    bool IsLiveIn = MRI.isLiveIn(Reg);<br>
+    if (!IsLiveIn)<br>
+       MBB.addLiveIn(Reg);<br>
<br>
     if (CRSpilled && IsCRField) {<br>
       CRMIB.addReg(Reg, RegState::ImplicitKill);<br>
@@ -1980,7 +1987,10 @@ PPCFrameLowering::<wbr>spillCalleeSavedRegist<br>
       }<br>
     } else {<br>
       const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(<wbr>Reg);<br>
-      TII.storeRegToStackSlot(MBB, MI, Reg, true,<br>
+      // Use !IsLiveIn for the kill flag.<br>
+      // We do not want to kill registers that are live in this function<br>
+      // before their use because they will become undefined registers.<br>
+      TII.storeRegToStackSlot(MBB, MI, Reg, !IsLiveIn,<br>
                               CSI[i].getFrameIdx(), RC, TRI);<br>
     }<br>
   }<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/PPCISelLowering.cpp?<wbr>rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCISelLowering.cpp Wed Jan 17 10:22:55 2018<br>
@@ -4939,7 +4939,11 @@ SDValue PPCTargetLowering::<wbr>LowerCallResu<br>
   SmallVector<CCValAssign, 16> RVLocs;<br>
   CCState CCRetInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,<br>
                     *DAG.getContext());<br>
-  CCRetInfo.AnalyzeCallResult(<wbr>Ins, RetCC_PPC);<br>
+<br>
+  CCRetInfo.AnalyzeCallResult(<br>
+      Ins, (Subtarget.isSVR4ABI() && CallConv == CallingConv::Cold)<br>
+               ? RetCC_PPC_Cold<br>
+               : RetCC_PPC);<br>
<br>
   // Copy all of the result registers out of their specified physreg.<br>
   for (unsigned i = 0, e = RVLocs.size(); i != e; ++i) {<br>
@@ -5159,6 +5163,7 @@ SDValue PPCTargetLowering::LowerCall_<wbr>32S<br>
   // of the 32-bit SVR4 ABI stack frame layout.<br>
<br>
   assert((CallConv == CallingConv::C ||<br>
+          CallConv == CallingConv::Cold ||<br>
           CallConv == CallingConv::Fast) && "Unknown calling convention!");<br>
<br>
   unsigned PtrByteSize = 4;<br>
@@ -6420,7 +6425,10 @@ PPCTargetLowering::<wbr>CanLowerReturn(Callin<br>
                                   LLVMContext &Context) const {<br>
   SmallVector<CCValAssign, 16> RVLocs;<br>
   CCState CCInfo(CallConv, isVarArg, MF, RVLocs, Context);<br>
-  return CCInfo.CheckReturn(Outs, RetCC_PPC);<br>
+  return CCInfo.CheckReturn(<br>
+      Outs, (Subtarget.isSVR4ABI() && CallConv == CallingConv::Cold)<br>
+                ? RetCC_PPC_Cold<br>
+                : RetCC_PPC);<br>
 }<br>
<br>
 SDValue<br>
@@ -6432,7 +6440,10 @@ PPCTargetLowering::<wbr>LowerReturn(SDValue C<br>
   SmallVector<CCValAssign, 16> RVLocs;<br>
   CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,<br>
                  *DAG.getContext());<br>
-  CCInfo.AnalyzeReturn(Outs, RetCC_PPC);<br>
+  CCInfo.AnalyzeReturn(Outs,<br>
+                       (Subtarget.isSVR4ABI() && CallConv == CallingConv::Cold)<br>
+                           ? RetCC_PPC_Cold<br>
+                           : RetCC_PPC);<br>
<br>
   SDValue Flag;<br>
   SmallVector<SDValue, 4> RetOps(1, Chain);<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCRegisterInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/PPCRegisterInfo.cpp?<wbr>rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCRegisterInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCRegisterInfo.cpp Wed Jan 17 10:22:55 2018<br>
@@ -144,6 +144,17 @@ PPCRegisterInfo::<wbr>getCalleeSavedRegs(cons<br>
   // On PPC64, we might need to save r2 (but only if it is not reserved).<br>
   bool SaveR2 = MF->getRegInfo().<wbr>isAllocatable(PPC::X2);<br>
<br>
+  if (MF->getFunction().<wbr>getCallingConv() == CallingConv::Cold) {<br>
+    return TM.isPPC64()<br>
+               ? (Subtarget.hasAltivec()<br>
+                      ? (SaveR2 ? CSR_SVR64_ColdCC_R2_Altivec_<wbr>SaveList<br>
+                                : CSR_SVR64_ColdCC_Altivec_<wbr>SaveList)<br>
+                      : (SaveR2 ? CSR_SVR64_ColdCC_R2_SaveList<br>
+                                : CSR_SVR64_ColdCC_SaveList))<br>
+               : (Subtarget.hasAltivec() ? CSR_SVR32_ColdCC_Altivec_<wbr>SaveList<br>
+                                         : CSR_SVR32_ColdCC_SaveList);<br>
+  }<br>
+<br>
   return TM.isPPC64()<br>
              ? (Subtarget.hasAltivec()<br>
                     ? (SaveR2 ? CSR_SVR464_R2_Altivec_SaveList<br>
@@ -196,6 +207,13 @@ PPCRegisterInfo::<wbr>getCallPreservedMask(co<br>
                         : (Subtarget.hasAltivec() ? CSR_Darwin32_Altivec_RegMask<br>
                                                   : CSR_Darwin32_RegMask);<br>
<br>
+  if (CC == CallingConv::Cold) {<br>
+    return TM.isPPC64() ? (Subtarget.hasAltivec() ? CSR_SVR64_ColdCC_Altivec_<wbr>RegMask<br>
+                                                  : CSR_SVR64_ColdCC_RegMask)<br>
+                        : (Subtarget.hasAltivec() ? CSR_SVR32_ColdCC_Altivec_<wbr>RegMask<br>
+                                                  : CSR_SVR32_ColdCC_RegMask);<br>
+  }<br>
+<br>
   return TM.isPPC64() ? (Subtarget.hasAltivec() ? CSR_SVR464_Altivec_RegMask<br>
                                                 : CSR_SVR464_RegMask)<br>
                       : (Subtarget.hasAltivec() ? CSR_SVR432_Altivec_RegMask<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/<wbr>PPCTargetTransformInfo.cpp?<wbr>rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.cpp Wed Jan 17 10:22:55 2018<br>
@@ -27,6 +27,11 @@ static cl::opt<unsigned><br>
 CacheLineSize("ppc-loop-<wbr>prefetch-cache-line", cl::Hidden, cl::init(64),<br>
               cl::desc("The loop prefetch cache line size"));<br>
<br>
+static cl::opt<bool><br>
+EnablePPCColdCC("ppc-enable-<wbr>coldcc", cl::Hidden, cl::init(false),<br>
+                cl::desc("Enable using coldcc calling conv for cold "<br>
+                         "internal functions"));<br>
+<br>
 //===-------------------------<wbr>------------------------------<wbr>---------------===//<br>
 //<br>
 // PPC cost model.<br>
@@ -215,6 +220,14 @@ void PPCTTIImpl::<wbr>getUnrollingPreferences<br>
   BaseT::<wbr>getUnrollingPreferences(L, SE, UP);<br>
 }<br>
<br>
+// This function returns true to allow using coldcc calling convention.<br>
+// Returning true results in coldcc being used for functions which are cold at<br>
+// all call sites when the callers of the functions are not calling any other<br>
+// non coldcc functions.<br>
+bool PPCTTIImpl::<wbr>useColdCCForColdCall(Function &F) {<br>
+  return EnablePPCColdCC;<br>
+}<br>
+<br>
 bool PPCTTIImpl::<wbr>enableAggressiveInterleaving(<wbr>bool LoopHasReductions) {<br>
   // On the A2, always unroll aggressively. For QPX unaligned loads, we depend<br>
   // on combining the loads generated for consecutive accesses, and failure to<br>
<br>
Modified: llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>PowerPC/<wbr>PPCTargetTransformInfo.h?rev=<wbr>322721&r1=322720&r2=322721&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.h (original)<br>
+++ llvm/trunk/lib/Target/PowerPC/<wbr>PPCTargetTransformInfo.h Wed Jan 17 10:22:55 2018<br>
@@ -61,7 +61,7 @@ public:<br>
<br>
   /// \name Vector TTI Implementations<br>
   /// @{<br>
-<br>
+  bool useColdCCForColdCall(Function &F);<br>
   bool enableAggressiveInterleaving(<wbr>bool LoopHasReductions);<br>
   const TTI::MemCmpExpansionOptions *enableMemCmpExpansion(<br>
       bool IsZeroCmp) const;<br>
<br>
Modified: llvm/trunk/lib/Transforms/IPO/<wbr>GlobalOpt.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Transforms/IPO/GlobalOpt.cpp?<wbr>rev=322721&r1=322720&r2=<wbr>322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Transforms/IPO/<wbr>GlobalOpt.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/IPO/<wbr>GlobalOpt.cpp Wed Jan 17 10:22:55 2018<br>
@@ -22,9 +22,11 @@<br>
 #include "llvm/ADT/Statistic.h"<br>
 #include "llvm/ADT/Twine.h"<br>
 #include "llvm/ADT/iterator_range.h"<br>
+#include "llvm/Analysis/<wbr>BlockFrequencyInfo.h"<br>
 #include "llvm/Analysis/<wbr>ConstantFolding.h"<br>
 #include "llvm/Analysis/MemoryBuiltins.<wbr>h"<br>
 #include "llvm/Analysis/<wbr>TargetLibraryInfo.h"<br>
+#include "llvm/Analysis/<wbr>TargetTransformInfo.h"<br>
 #include "llvm/BinaryFormat/Dwarf.h"<br>
 #include "llvm/IR/Attributes.h"<br>
 #include "llvm/IR/BasicBlock.h"<br>
@@ -55,6 +57,7 @@<br>
 #include "llvm/Pass.h"<br>
 #include "llvm/Support/AtomicOrdering.<wbr>h"<br>
 #include "llvm/Support/Casting.h"<br>
+#include "llvm/Support/CommandLine.h"<br>
 #include "llvm/Support/Debug.h"<br>
 #include "llvm/Support/ErrorHandling.h"<br>
 #include "llvm/Support/MathExtras.h"<br>
@@ -88,6 +91,21 @@ STATISTIC(NumNestRemoved   , "Number of<br>
 STATISTIC(NumAliasesResolved, "Number of global aliases resolved");<br>
 STATISTIC(NumAliasesRemoved, "Number of global aliases eliminated");<br>
 STATISTIC(NumCXXDtorsRemoved, "Number of global C++ destructors removed");<br>
+STATISTIC(NumInternalFunc, "Number of internal functions");<br>
+STATISTIC(NumColdCC, "Number of functions marked coldcc");<br>
+<br>
+static cl::opt<bool><br>
+    EnableColdCCStressTest("<wbr>enable-coldcc-stress-test",<br>
+                           cl::desc("Enable stress test of coldcc by adding "<br>
+                                    "calling conv to all internal functions."),<br>
+                           cl::init(false), cl::Hidden);<br>
+<br>
+static cl::opt<int> ColdCCRelFreq(<br>
+    "coldcc-rel-freq", cl::Hidden, cl::init(2), cl::ZeroOrMore,<br>
+    cl::desc(<br>
+        "Maximum block frequency, expressed as a percentage of caller's "<br>
+        "entry frequency, for a call site to be considered cold for enabling "<br>
+        "coldcc"));<br>
<br>
 /// Is this global variable possibly used by a leak checker as a root?  If so,<br>
 /// we might not really want to eliminate the stores to it.<br>
@@ -2097,20 +2115,114 @@ static void RemoveNestAttribute(Function<br>
 /// idea here is that we don't want to mess with the convention if the user<br>
 /// explicitly requested something with performance implications like coldcc,<br>
 /// GHC, or anyregcc.<br>
-static bool isProfitableToMakeFastCC(<wbr>Function *F) {<br>
+static bool hasChangeableCC(Function *F) {<br>
   CallingConv::ID CC = F->getCallingConv();<br>
   // FIXME: Is it worth transforming x86_stdcallcc and x86_fastcallcc?<br>
   return CC == CallingConv::C || CC == CallingConv::X86_ThisCall;<br>
 }<br>
<br>
+/// Return true if the block containing the call site has a BlockFrequency of<br>
+/// less than ColdCCRelFreq% of the entry block.<br>
+static bool isColdCallSite(CallSite CS, BlockFrequencyInfo &CallerBFI) {<br>
+  const BranchProbability ColdProb(ColdCCRelFreq, 100);<br>
+  auto CallSiteBB = CS.getInstruction()-><wbr>getParent();<br>
+  auto CallSiteFreq = CallerBFI.getBlockFreq(<wbr>CallSiteBB);<br>
+  auto CallerEntryFreq =<br>
+      CallerBFI.getBlockFreq(&(CS.<wbr>getCaller()->getEntryBlock()))<wbr>;<br>
+  return CallSiteFreq < CallerEntryFreq * ColdProb;<br>
+}<br>
+<br>
+// This function checks if the input function F is cold at all call sites. It<br>
+// also looks at each call site's containing function, returning false if the<br>
+// caller function contains other non cold calls. The input vector AllCallsCold<br>
+// contains a list of functions that only have call sites in cold blocks.<br>
+static bool<br>
+isValidCandidateForColdCC(<wbr>Function &F,<br>
+                          function_ref<<wbr>BlockFrequencyInfo &(Function &)> GetBFI,<br>
+                          const std::vector<Function *> &AllCallsCold) {<br>
+<br>
+  if (F.user_empty())<br>
+    return false;<br>
+<br>
+  for (User *U : F.users()) {<br>
+    if (isa<BlockAddress>(U))<br>
+      continue;<br>
+<br>
+    CallSite CS(cast<Instruction>(U));<br>
+    Function *CallerFunc = CS.getInstruction()-><wbr>getParent()->getParent();<br>
+    BlockFrequencyInfo &CallerBFI = GetBFI(*CallerFunc);<br>
+    if (!isColdCallSite(CS, CallerBFI))<br>
+      return false;<br>
+    auto It = std::find(AllCallsCold.begin()<wbr>, AllCallsCold.end(), CallerFunc);<br>
+    if (It == AllCallsCold.end())<br>
+      return false;<br>
+  }<br>
+  return true;<br>
+}<br>
+<br>
+static void changeCallSitesToColdCC(<wbr>Function *F) {<br>
+  for (User *U : F->users()) {<br>
+    if (isa<BlockAddress>(U))<br>
+      continue;<br>
+    CallSite CS(cast<Instruction>(U));<br>
+    CS.setCallingConv(CallingConv:<wbr>:Cold);<br>
+  }<br>
+}<br>
+<br>
+// This function iterates over all the call instructions in the input Function<br>
+// and checks that all call sites are in cold blocks and are allowed to use the<br>
+// coldcc calling convention.<br>
+static bool<br>
+hasOnlyColdCalls(Function &F,<br>
+                 function_ref<<wbr>BlockFrequencyInfo &(Function &)> GetBFI) {<br>
+  for (BasicBlock &BB : F) {<br>
+    for (Instruction &I : BB) {<br>
+      if (CallInst *CI = dyn_cast<CallInst>(&I)) {<br>
+        CallSite CS(cast<Instruction>(CI));<br>
+        // Skip over inline asm instructions since they aren't function calls.<br>
+        if (CI->isInlineAsm())<br>
+          continue;<br>
+        Function *CalledFn = CI->getCalledFunction();<br>
+        if (!CalledFn)<br>
+          return false;<br>
+        if (!CalledFn->hasLocalLinkage())<br>
+          return false;<br>
+        // Skip over intrinsics since they won't remain as function calls.<br>
+        if (CalledFn->getIntrinsicID() != Intrinsic::not_intrinsic)<br>
+          continue;<br>
+        // Check if it's valid to use coldcc calling convention.<br>
+        if (!hasChangeableCC(CalledFn) || CalledFn->isVarArg() ||<br>
+            CalledFn->hasAddressTaken())<br>
+          return false;<br>
+        BlockFrequencyInfo &CallerBFI = GetBFI(F);<br>
+        if (!isColdCallSite(CS, CallerBFI))<br>
+          return false;<br>
+      }<br>
+    }<br>
+  }<br>
+  return true;<br>
+}<br>
+<br>
 static bool<br>
 OptimizeFunctions(Module &M, TargetLibraryInfo *TLI,<br>
+                  function_ref<<wbr>TargetTransformInfo &(Function &)> GetTTI,<br>
+                  function_ref<<wbr>BlockFrequencyInfo &(Function &)> GetBFI,<br>
                   function_ref<DominatorTree &(Function &)> LookupDomTree,<br>
                   SmallSet<const Comdat *, 8> &NotDiscardableComdats) {<br>
+<br>
   bool Changed = false;<br>
+<br>
+  std::vector<Function *> AllCallsCold;<br>
+  for (Module::iterator FI = M.begin(), E = M.end(); FI != E;) {<br>
+    Function *F = &*FI++;<br>
+    if (hasOnlyColdCalls(*F, GetBFI))<br>
+      AllCallsCold.push_back(F);<br>
+  }<br>
+<br>
   // Optimize functions.<br>
   for (Module::iterator FI = M.begin(), E = M.end(); FI != E; ) {<br>
     Function *F = &*FI++;<br>
+<br>
     // Functions without names cannot be referenced outside this module.<br>
     if (!F->hasName() && !F->isDeclaration() && !F->hasLocalLinkage())<br>
       F->setLinkage(GlobalValue::<wbr>InternalLinkage);<br>
@@ -2142,7 +2254,25 @@ OptimizeFunctions(Module &M, TargetLibra<br>
<br>
     if (!F->hasLocalLinkage())<br>
       continue;<br>
-    if (isProfitableToMakeFastCC(F) && !F->isVarArg() &&<br>
+<br>
+    if (hasChangeableCC(F) && !F->isVarArg() && !F->hasAddressTaken()) {<br>
+      NumInternalFunc++;<br>
+      TargetTransformInfo &TTI = GetTTI(*F);<br>
+      // Change the calling convention to coldcc if either stress testing is<br>
+      // enabled or the target would like to use coldcc on functions which are<br>
+      // cold at all call sites and the callers contain no other non-coldcc<br>
+      // calls.<br>
+      if (EnableColdCCStressTest ||<br>
+          (isValidCandidateForColdCC(*F, GetBFI, AllCallsCold) &&<br>
+           TTI.useColdCCForColdCall(*F))) {<br>
+        F->setCallingConv(CallingConv:<wbr>:Cold);<br>
+        changeCallSitesToColdCC(F);<br>
+        Changed = true;<br>
+        NumColdCC++;<br>
+      }<br>
+    }<br>
+<br>
+    if (hasChangeableCC(F) && !F->isVarArg() &&<br>
         !F->hasAddressTaken()) {<br>
       // If this function has a calling convention worth changing, is not a<br>
       // varargs function, and is only called directly, promote it to use the<br>
@@ -2620,6 +2750,8 @@ static bool OptimizeEmptyGlobalCXXDtors(<br>
<br>
 static bool optimizeGlobalsInModule(<br>
     Module &M, const DataLayout &DL, TargetLibraryInfo *TLI,<br>
+    function_ref<<wbr>TargetTransformInfo &(Function &)> GetTTI,<br>
+    function_ref<<wbr>BlockFrequencyInfo &(Function &)> GetBFI,<br>
     function_ref<DominatorTree &(Function &)> LookupDomTree) {<br>
   SmallSet<const Comdat *, 8> NotDiscardableComdats;<br>
   bool Changed = false;<br>
@@ -2642,8 +2774,8 @@ static bool optimizeGlobalsInModule(<br>
           NotDiscardableComdats.insert(<wbr>C);<br>
<br>
     // Delete functions that are trivially dead, ccc -> fastcc<br>
-    LocalChange |=<br>
-        OptimizeFunctions(M, TLI, LookupDomTree, NotDiscardableComdats);<br>
+    LocalChange |= OptimizeFunctions(M, TLI, GetTTI, GetBFI, LookupDomTree,<br>
+                                     NotDiscardableComdats);<br>
<br>
     // Optimize global_ctors list.<br>
     LocalChange |= optimizeGlobalCtorsList(M, [&](Function *F) {<br>
@@ -2680,7 +2812,15 @@ PreservedAnalyses GlobalOptPass::run(Mod<br>
     auto LookupDomTree = [&FAM](Function &F) -> DominatorTree &{<br>
       return FAM.getResult<<wbr>DominatorTreeAnalysis>(F);<br>
     };<br>
-    if (!optimizeGlobalsInModule(M, DL, &TLI, LookupDomTree))<br>
+    auto GetTTI = [&FAM](Function &F) -> TargetTransformInfo & {<br>
+      return FAM.getResult<<wbr>TargetIRAnalysis>(F);<br>
+    };<br>
+<br>
+    auto GetBFI = [&FAM](Function &F) -> BlockFrequencyInfo & {<br>
+      return FAM.getResult<<wbr>BlockFrequencyAnalysis>(F);<br>
+    };<br>
+<br>
+    if (!optimizeGlobalsInModule(M, DL, &TLI, GetTTI, GetBFI, LookupDomTree))<br>
       return PreservedAnalyses::all();<br>
     return PreservedAnalyses::none();<br>
 }<br>
@@ -2703,11 +2843,21 @@ struct GlobalOptLegacyPass : public Modu<br>
     auto LookupDomTree = [this](Function &F) -> DominatorTree & {<br>
       return this->getAnalysis<<wbr>DominatorTreeWrapperPass>(F).<wbr>getDomTree();<br>
     };<br>
-    return optimizeGlobalsInModule(M, DL, TLI, LookupDomTree);<br>
+    auto GetTTI = [this](Function &F) -> TargetTransformInfo & {<br>
+      return this->getAnalysis<<wbr>TargetTransformInfoWrapperPass<wbr>>().getTTI(F);<br>
+    };<br>
+<br>
+    auto GetBFI = [this](Function &F) -> BlockFrequencyInfo & {<br>
+      return this->getAnalysis<<wbr>BlockFrequencyInfoWrapperPass><wbr>(F).getBFI();<br>
+    };<br>
+<br>
+    return optimizeGlobalsInModule(M, DL, TLI, GetTTI, GetBFI, LookupDomTree);<br>
   }<br>
<br>
   void getAnalysisUsage(AnalysisUsage &AU) const override {<br>
     AU.addRequired<<wbr>TargetLibraryInfoWrapperPass>(<wbr>);<br>
+    AU.addRequired<<wbr>TargetTransformInfoWrapperPass<wbr>>();<br>
+    AU.addRequired<<wbr>BlockFrequencyInfoWrapperPass><wbr>();<br>
     AU.addRequired<<wbr>DominatorTreeWrapperPass>();<br>
   }<br>
 };<br>
@@ -2719,6 +2869,8 @@ char GlobalOptLegacyPass::ID = 0;<br>
 INITIALIZE_PASS_BEGIN(<wbr>GlobalOptLegacyPass, "globalopt",<br>
                       "Global Variable Optimizer", false, false)<br>
 INITIALIZE_PASS_DEPENDENCY(<wbr>TargetLibraryInfoWrapperPass)<br>
+INITIALIZE_PASS_DEPENDENCY(<wbr>TargetTransformInfoWrapperPass<wbr>)<br>
+INITIALIZE_PASS_DEPENDENCY(<wbr>BlockFrequencyInfoWrapperPass)<br>
 INITIALIZE_PASS_DEPENDENCY(<wbr>DominatorTreeWrapperPass)<br>
 INITIALIZE_PASS_END(<wbr>GlobalOptLegacyPass, "globalopt",<br>
                     "Global Variable Optimizer", false, false)<br>
<br>
Added: llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/coldcc.ll?rev=322721&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>CodeGen/PowerPC/coldcc.ll?rev=<wbr>322721&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc.ll (added)<br>
+++ llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc.ll Wed Jan 17 10:22:55 2018<br>
@@ -0,0 +1,46 @@<br>
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-<wbr>linux-gnu  < %s | FileCheck %s -check-prefix=COLDCC<br>
+<br>
+define signext i32 @caller(i32 signext %a, i32 signext %b, i32 signext %cold) {<br>
+entry:<br>
+  %0 = tail call i32 asm "add $0, $1, $2", "=r,r,r,~{r14},~{r15},~{r16},~<wbr>{r17},~{r18},~{r19},~{r20},~{<wbr>r21},~{r22},~{r23},~{r24},~{<wbr>r25},~{r26},~{r27},~{r28},~{<wbr>r29},~{r30},~{r31}"(i32 %a, i32 %b)<br>
+  %mul = mul nsw i32 %0, %cold<br>
+  %tobool = icmp eq i32 %cold, 0<br>
+  br i1 %tobool, label %if.end, label %if.then<br>
+<br>
+if.then:                                          ; preds = %entry<br>
+  %mul1 = mul nsw i32 %mul, %cold<br>
+  %mul2 = mul nsw i32 %b, %a<br>
+  %call = tail call coldcc signext i32 @callee(i32 signext %a, i32 signext %b)<br>
+  %add = add i32 %mul2, %a<br>
+  %add3 = add i32 %add, %mul<br>
+  %add4 = add i32 %add3, %mul1<br>
+  %add5 = add i32 %add4, %call<br>
+  br label %if.end<br>
+<br>
+if.end:                                           ; preds = %entry, %if.then<br>
+  %f.0 = phi i32 [ %add5, %if.then ], [ %0, %entry ]<br>
+  ret i32 %f.0<br>
+}<br>
+<br>
+define internal coldcc signext i32 @callee(i32 signext %a, i32 signext %b) local_unnamed_addr #0 {<br>
+entry:<br>
+; COLDCC: @callee<br>
+; COLDCC: std 6, -8(1)<br>
+; COLDCC: std 7, -16(1)<br>
+; COLDCC: std 8, -24(1)<br>
+; COLDCC: std 9, -32(1)<br>
+; COLDCC: std 10, -40(1)<br>
+; COLDCC: ld 9, -32(1)<br>
+; COLDCC: ld 8, -24(1)<br>
+; COLDCC: ld 7, -16(1)<br>
+; COLDCC: ld 10, -40(1)<br>
+; COLDCC: ld 6, -8(1)<br>
+  %0 = tail call i32 asm "add $0, $1, $2", "=r,r,r,~{r6},~{r7},~{r8},~{<wbr>r9},~{r10}"(i32 %a, i32 %b)<br>
+  %mul = mul nsw i32 %a, 3<br>
+  %1 = mul i32 %b, -5<br>
+  %add = add i32 %1, %mul<br>
+  %sub = add i32 %add, %0<br>
+  ret i32 %sub<br>
+}<br>
+<br>
+attributes #0 = { noinline }<br>
<br>
Added: llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc2.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/coldcc2.ll?rev=322721&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>CodeGen/PowerPC/coldcc2.ll?<wbr>rev=322721&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc2.ll (added)<br>
+++ llvm/trunk/test/CodeGen/<wbr>PowerPC/coldcc2.ll Wed Jan 17 10:22:55 2018<br>
@@ -0,0 +1,42 @@<br>
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-<wbr>linux-gnu < %s | FileCheck %s -check-prefix=COLDCC<br>
+<br>
+%struct.MyStruct = type { i32, i32, i32, i32 }<br>
+<br>
+@caller.s = internal unnamed_addr global %struct.MyStruct zeroinitializer, align 8<br>
+<br>
+define signext i32 @caller(i32 signext %a, i32 signext %b, i32 signext %cold) {<br>
+entry:<br>
+; COLDCC: bl callee<br>
+; COLDCC: ld 4, 40(1)<br>
+; COLDCC: ld 5, 32(1)<br>
+  %call = tail call coldcc { i64, i64 } @callee(i32 signext %a, i32 signext %b)<br>
+  %0 = extractvalue { i64, i64 } %call, 0<br>
+  %1 = extractvalue { i64, i64 } %call, 1<br>
+  store i64 %0, i64* bitcast (%struct.MyStruct* @caller.s to i64*), align 8<br>
+  store i64 %1, i64* bitcast (i32* getelementptr inbounds (%struct.MyStruct, %struct.MyStruct* @caller.s, i64 0, i32 2) to i64*), align 8<br>
+  %2 = lshr i64 %1, 32<br>
+  %3 = trunc i64 %2 to i32<br>
+  %sub = sub nsw i32 0, %3<br>
+  ret i32 %sub<br>
+}<br>
+<br>
+define internal coldcc { i64, i64 } @callee(i32 signext %a, i32 signext %b) {<br>
+entry:<br>
+; COLDCC: std {{[0-9]+}}, 0(3)<br>
+; COLDCC: std {{[0-9]+}}, 8(3)<br>
+  %0 = tail call i32 asm "add $0, $1, $2", "=r,r,r,~{r6},~{r7},~{r8},~{<wbr>r9},~{r10}"(i32 %a, i32 %b)<br>
+  %mul = mul nsw i32 %a, 3<br>
+  %1 = mul i32 %b, -5<br>
+  %add = add i32 %1, %mul<br>
+  %sub = add i32 %add, %0<br>
+  %mul5 = mul nsw i32 %b, %a<br>
+  %add6 = add nsw i32 %sub, %mul5<br>
+  %retval.sroa.0.0.insert.ext = zext i32 %0 to i64<br>
+  %retval.sroa.3.8.insert.ext = zext i32 %sub to i64<br>
+  %retval.sroa.3.12.insert.ext = zext i32 %add6 to i64<br>
+  %retval.sroa.3.12.insert.shift = shl nuw i64 %retval.sroa.3.12.insert.ext, 32<br>
+  %retval.sroa.3.12.insert.<wbr>insert = or i64 %retval.sroa.3.12.insert.<wbr>shift, %retval.sroa.3.8.insert.ext<br>
+  %.fca.0.insert = insertvalue { i64, i64 } undef, i64 %retval.sroa.0.0.insert.ext, 0<br>
+  %.fca.1.insert = insertvalue { i64, i64 } %.fca.0.insert, i64 %retval.sroa.3.12.insert.<wbr>insert, 1<br>
+  ret { i64, i64 } %.fca.1.insert<br>
+}<br>
<br>
Modified: llvm/trunk/test/Other/pass-<wbr>pipelines.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/pass-pipelines.ll?rev=322721&r1=322720&r2=322721&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/Other/<wbr>pass-pipelines.ll?rev=322721&<wbr>r1=322720&r2=322721&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/Other/pass-<wbr>pipelines.ll (original)<br>
+++ llvm/trunk/test/Other/pass-<wbr>pipelines.ll Wed Jan 17 10:22:55 2018<br>
@@ -93,7 +93,7 @@<br>
 ; FIXME: There really shouldn't be another pass manager, especially one that<br>
 ; just builds the domtree. It doesn't even run the verifier.<br>
 ; CHECK-O2: Pass Arguments:<br>
-; CHECK-O2-NEXT: FunctionPass Manager<br>
+; CHECK-O2: FunctionPass Manager<br>
 ; CHECK-O2-NEXT: Dominator Tree Construction<br>
<br>
 define void @foo() {<br>
<br>
Added: llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/coldcc_<wbr>coldsites.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/PowerPC/coldcc_coldsites.ll?rev=322721&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/GlobalOpt/PowerPC/<wbr>coldcc_coldsites.ll?rev=<wbr>322721&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/coldcc_<wbr>coldsites.ll (added)<br>
+++ llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/coldcc_<wbr>coldsites.ll Wed Jan 17 10:22:55 2018<br>
@@ -0,0 +1,81 @@<br>
+; RUN: opt -globalopt -mtriple=powerpc64le-unknown-<wbr>linux-gnu -ppc-enable-coldcc -S < %s | FileCheck %s -check-prefix=COLDCC<br>
+; RUN: opt -globalopt -S < %s | FileCheck %s -check-prefix=CHECK<br>
+<br>
+define signext i32 @caller(i32 signext %a, i32 signext %b, i32 signext %lim, i32 signext %i) local_unnamed_addr #0 !prof !30 {<br>
+entry:<br>
+; COLDCC: call coldcc signext i32 @callee<br>
+; CHECK: call fastcc signext i32 @callee<br>
+  %add = add nsw i32 %b, %a<br>
+  %sub = add nsw i32 %lim, -1<br>
+  %cmp = icmp eq i32 %sub, %i<br>
+  br i1 %cmp, label %if.then, label %if.end, !prof !31<br>
+<br>
+if.then:                                          ; preds = %entry<br>
+  %call = tail call signext i32 @callee(i32 signext %a, i32 signext %b)<br>
+  br label %if.end<br>
+<br>
+if.end:                                           ; preds = %if.then, %entry<br>
+  %f.0 = phi i32 [ %call, %if.then ], [ %add, %entry ]<br>
+  ret i32 %f.0<br>
+}<br>
+<br>
+define internal signext i32 @callee(i32 signext %a, i32 signext %b) unnamed_addr #0 {<br>
+entry:<br>
+  %0 = tail call i32 asm "add $0, $1, $2", "=r,r,r,~{r6},~{r7},~{r8},~{<wbr>r9}"(i32 %a, i32 %b) #1, !srcloc !32<br>
+  %mul = mul nsw i32 %a, 3<br>
+  %mul1 = shl i32 %0, 1<br>
+  %add = add nsw i32 %mul1, %mul<br>
+  ret i32 %add<br>
+}<br>
+<br>
+define signext i32 @main() local_unnamed_addr #0 !prof !33 {<br>
+entry:<br>
+  br label %for.body<br>
+<br>
+for.cond.cleanup:                                 ; preds = %for.body<br>
+  %add.lcssa = phi i32 [ %add, %for.body ]<br>
+  ret i32 %add.lcssa<br>
+<br>
+for.body:                                         ; preds = %for.body, %entry<br>
+  %i.011 = phi i32 [ 0, %entry ], [ %inc, %for.body ]<br>
+  %ret.010 = phi i32 [ 0, %entry ], [ %add, %for.body ]<br>
+  %call = tail call signext i32 @caller(i32 signext 4, i32 signext 5, i32 signext 10000000, i32 signext %i.011)<br>
+  %add = add nsw i32 %call, %ret.010<br>
+  %inc = add nuw nsw i32 %i.011, 1<br>
+  %exitcond = icmp eq i32 %inc, 10000000<br>
+  br i1 %exitcond, label %for.cond.cleanup, label %for.body, !prof !34<br>
+}<br>
+attributes #0 = { noinline }<br>
+<br>
+!0 = !{i32 1, !"ProfileSummary", !1}<br>
+!1 = !{!2, !3, !4, !5, !6, !7, !8, !9}<br>
+!2 = !{!"ProfileFormat", !"InstrProf"}<br>
+!3 = !{!"TotalCount", i64 20000003}<br>
+!4 = !{!"MaxCount", i64 10000000}<br>
+!5 = !{!"MaxInternalCount", i64 10000000}<br>
+!6 = !{!"MaxFunctionCount", i64 10000000}<br>
+!7 = !{!"NumCounts", i64 5}<br>
+!8 = !{!"NumFunctions", i64 3}<br>
+!9 = !{!"DetailedSummary", !10}<br>
+!10 = !{!11, !12, !13, !14, !15, !16, !16, !17, !17, !18, !19, !20, !21, !22, !23, !24, !25, !26}<br>
+!11 = !{i32 10000, i64 10000000, i32 2}<br>
+!12 = !{i32 100000, i64 10000000, i32 2}<br>
+!13 = !{i32 200000, i64 10000000, i32 2}<br>
+!14 = !{i32 300000, i64 10000000, i32 2}<br>
+!15 = !{i32 400000, i64 10000000, i32 2}<br>
+!16 = !{i32 500000, i64 10000000, i32 2}<br>
+!17 = !{i32 600000, i64 10000000, i32 2}<br>
+!18 = !{i32 700000, i64 10000000, i32 2}<br>
+!19 = !{i32 800000, i64 10000000, i32 2}<br>
+!20 = !{i32 900000, i64 10000000, i32 2}<br>
+!21 = !{i32 950000, i64 10000000, i32 2}<br>
+!22 = !{i32 990000, i64 10000000, i32 2}<br>
+!23 = !{i32 999000, i64 10000000, i32 2}<br>
+!24 = !{i32 999900, i64 10000000, i32 2}<br>
+!25 = !{i32 999990, i64 10000000, i32 2}<br>
+!26 = !{i32 999999, i64 10000000, i32 2}<br>
+!30 = !{!"function_entry_count", i64 10000000}<br>
+!31 = !{!"branch_weights", i32 2, i32 10000000}<br>
+!32 = !{i32 59}<br>
+!33 = !{!"function_entry_count", i64 1}<br>
+!34 = !{!"branch_weights", i32 2, i32 10000001}<br>
<br>
Added: llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/lit.local.<wbr>cfg<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/PowerPC/lit.local.cfg?rev=322721&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/GlobalOpt/PowerPC/<wbr>lit.local.cfg?rev=322721&view=<wbr>auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/lit.local.<wbr>cfg (added)<br>
+++ llvm/trunk/test/Transforms/<wbr>GlobalOpt/PowerPC/lit.local.<wbr>cfg Wed Jan 17 10:22:55 2018<br>
@@ -0,0 +1,3 @@<br>
+if not 'PowerPC' in config.root.targets:<br>
+    config.unsupported = True<br>
+<br>
<br>
Added: llvm/trunk/test/Transforms/<wbr>GlobalOpt/coldcc_stress_test.<wbr>ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/coldcc_stress_test.ll?rev=322721&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/GlobalOpt/coldcc_<wbr>stress_test.ll?rev=322721&<wbr>view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/Transforms/<wbr>GlobalOpt/coldcc_stress_test.<wbr>ll (added)<br>
+++ llvm/trunk/test/Transforms/<wbr>GlobalOpt/coldcc_stress_test.<wbr>ll Wed Jan 17 10:22:55 2018<br>
@@ -0,0 +1,48 @@<br>
+; RUN: opt < %s -globalopt -S -enable-coldcc-stress-test -mtriple=powerpc64le-unknown-<wbr>linux-gnu | FileCheck %s -check-prefix=COLDCC<br>
+; RUN: opt < %s -globalopt -S | FileCheck %s -check-prefix=CHECK<br>
+<br>
+define internal i32 @callee_default(i32* %m) {<br>
+; COLDCC-LABEL: define internal coldcc i32 @callee_default<br>
+; CHECK-LABEL: define internal fastcc i32 @callee_default<br>
+  %v = load i32, i32* %m<br>
+  ret i32 %v<br>
+}<br>
+<br>
+define internal fastcc i32 @callee_fastcc(i32* %m) {<br>
+; COLDCC-LABEL: define internal fastcc i32 @callee_fastcc<br>
+; CHECK-LABEL: define internal fastcc i32 @callee_fastcc<br>
+  %v = load i32, i32* %m<br>
+  ret i32 %v<br>
+}<br>
+<br>
+define internal coldcc i32 @callee_coldcc(i32* %m) {<br>
+; COLDCC-LABEL: define internal coldcc i32 @callee_coldcc<br>
+; CHECK-LABEL: define internal coldcc i32 @callee_coldcc<br>
+  %v = load i32, i32* %m<br>
+  ret i32 %v<br>
+}<br>
+<br>
+define i32 @callee(i32* %m) {<br>
+  %v = load i32, i32* %m<br>
+  ret i32 %v<br>
+}<br>
+<br>
+define void @caller() {<br>
+  %m = alloca i32<br>
+  call i32 @callee_default(i32* %m)<br>
+  call fastcc i32 @callee_fastcc(i32* %m)<br>
+  call coldcc i32 @callee_coldcc(i32* %m)<br>
+  call i32 @callee(i32* %m)<br>
+  ret void<br>
+}<br>
+<br>
+; COLDCC-LABEL: define void @caller()<br>
+; COLDCC: call coldcc i32 @callee_default<br>
+; COLDCC: call fastcc i32 @callee_fastcc<br>
+; COLDCC: call coldcc i32 @callee_coldcc<br>
+; COLDCC: call i32 @callee<br>
+; CHECK-LABEL: define void @caller()<br>
+; CHECK: call fastcc i32 @callee_default<br>
+; CHECK: call fastcc i32 @callee_fastcc<br>
+; CHECK: call coldcc i32 @callee_coldcc<br>
+; CHECK: call i32 @callee<br>
<br>
<br>
</blockquote></div><br></div>