[llvm] r198972 - Propagation of profile samples through the CFG.

Diego Novillo dnovillo at google.com
Fri Jan 10 15:23:46 PST 2014


Author: dnovillo
Date: Fri Jan 10 17:23:46 2014
New Revision: 198972

URL: http://llvm.org/viewvc/llvm-project?rev=198972&view=rev
Log:
Propagation of profile samples through the CFG.

This adds a propagation heuristic to convert instruction samples
into branch weights. It implements a similar heuristic to the one
implemented by Dehao Chen on GCC.

The propagation proceeds in 3 phases:

1- Assignment of block weights. All the basic blocks in the function
   are initially assigned the same weight as their most frequently
   executed instruction.

2- Creation of equivalence classes. Since samples may be missing from
   blocks, we can fill in the gaps by setting the weights of all the
   blocks in the same equivalence class to the same weight. To compute
   the concept of equivalence, we use dominance and loop information.
   Two blocks B1 and B2 are in the same equivalence class if B1
   dominates B2, B2 post-dominates B1 and both are in the same loop.

3- Propagation of block weights into edges. This uses a simple
   propagation heuristic. The following rules are applied to every
   block B in the CFG:

   - If B has a single predecessor/successor, then the weight
     of that edge is the weight of the block.

   - If all the edges are known except one, and the weight of the
     block is already known, the weight of the unknown edge will
     be the weight of the block minus the sum of all the known
     edges. If the sum of all the known edges is larger than B's weight,
     we set the unknown edge weight to zero.

   - If there is a self-referential edge, and the weight of the block is
     known, the weight for that edge is set to the weight of the block
     minus the weight of the other incoming edges to that block (if
     known).
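
As a rough illustration of the arithmetic behind the three phases, here is a
minimal standalone C++11 sketch. It is not the code from this patch: the
function names (blockWeight, equalizeClass, unknownEdgeWeight) are
hypothetical, and plain vectors of sample counts stand in for LLVM basic
blocks and edges.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not part of SampleProfile.cpp.

// Phase 1: a block's initial weight is the largest sample count among its
// instructions. A block with no samples gets weight 0.
uint32_t blockWeight(const std::vector<uint32_t> &InstSamples) {
  uint32_t Weight = 0;
  for (uint32_t S : InstSamples)
    Weight = std::max(Weight, S);
  return Weight;
}

// Phase 2: every block in an equivalence class receives the largest weight
// seen among the members of that class.
void equalizeClass(std::vector<uint32_t> &ClassWeights) {
  if (ClassWeights.empty())
    return;
  uint32_t Max = *std::max_element(ClassWeights.begin(), ClassWeights.end());
  std::fill(ClassWeights.begin(), ClassWeights.end(), Max);
}

// Phase 3, second rule: with exactly one unknown edge, its weight is the
// block weight minus the sum of the known edges, clamped at zero.
uint32_t unknownEdgeWeight(uint32_t BlockWeight,
                           const std::vector<uint32_t> &KnownEdges) {
  uint64_t Sum = 0; // 64-bit accumulator so the sum cannot wrap around.
  for (uint32_t W : KnownEdges)
    Sum += W;
  if (Sum > BlockWeight)
    return 0;
  return BlockWeight - static_cast<uint32_t>(Sum);
}
```

For example, a block of weight 10 with known edges of weight 3 and 4 would
give the remaining edge a weight of 3, while a block of weight 5 with the
same known edges would give it 0.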

Since this propagation is not guaranteed to converge for every CFG, we
only allow it to proceed for a limited number of iterations (controlled
by -sample-profile-max-propagate-iterations). The default is 100, the
same default that GCC uses.
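
The bounded iteration can be pictured as a capped fixed-point loop. The
sketch below is a simplified standalone C++11 model, not the pass itself:
runToFixedPoint and its Step callback are hypothetical names, with Step
standing in for one propagation sweep over the CFG that reports whether
anything changed.

```cpp
#include <functional>

// Hypothetical helper for illustration only; not part of SampleProfile.cpp.
// Calls Step until it reports no change or MaxIterations calls have been
// made. Returns the number of times Step was invoked.
unsigned runToFixedPoint(const std::function<bool()> &Step,
                         unsigned MaxIterations = 100) {
  bool Changed = true;
  unsigned Iterations = 0;
  while (Changed && Iterations < MaxIterations) {
    Changed = Step();
    ++Iterations;
  }
  return Iterations;
}
```

A sweep that never settles is cut off after MaxIterations calls, so the
loop terminates even on CFGs where the heuristic keeps changing weights.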

Before propagation starts, the pass builds (for each block) a list of
unique predecessors and successors. This is necessary to handle
identical edges in multiway branches. Since we visit all blocks and all
edges of the CFG, it is cleaner to build these lists once at the start
of the pass.
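
To show why the lists must be unique, here is a small standalone C++11
sketch of the de-duplication step. It is only an approximation of the idea:
uniqueEdges is a hypothetical name, and integer block ids replace the
BasicBlock pointers used in the patch.

```cpp
#include <set>
#include <vector>

// Hypothetical helper for illustration only; not part of SampleProfile.cpp.
// Returns Targets with duplicates removed, keeping first-seen order. A
// switch with several cases jumping to the same block would otherwise
// contribute the same edge more than once.
std::vector<int> uniqueEdges(const std::vector<int> &Targets) {
  std::set<int> Seen;
  std::vector<int> Unique;
  for (int T : Targets)
    if (Seen.insert(T).second) // .second is true only on first insertion.
      Unique.push_back(T);
  return Unique;
}
```

With successor ids 2, 5, 2, 7, 5 this yields 2, 5, 7, so each distinct edge
is counted once when edge weights are summed.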

Finally, the patch fixes the computation of relative line locations.
The profiler emits lines relative to the function header. To discover
the header line, we traverse the compilation unit looking for the subprogram
corresponding to the function. The line number of that subprogram is the
line where the function begins. That becomes line zero for all the
relative locations.
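
The relative-line lookup can be sketched as follows. This is a standalone
C++11 approximation rather than the patch's code: samplesForLine is a
hypothetical name and std::map stands in for the DenseMap used by the pass.
Lines before the function header, and lines with no recorded samples, are
assumed to yield 0.

```cpp
#include <cstdint>
#include <map>

// Hypothetical helper for illustration only; not part of SampleProfile.cpp.
// Returns the samples recorded for absolute source line Lineno, where the
// profile is keyed by offsets from the function header line.
uint32_t samplesForLine(const std::map<unsigned, uint32_t> &BodySamples,
                        unsigned HeaderLineno, unsigned Lineno) {
  if (Lineno < HeaderLineno)
    return 0; // The line precedes the function; no offset exists for it.
  auto It = BodySamples.find(Lineno - HeaderLineno);
  return It == BodySamples.end() ? 0 : It->second;
}
```

With the header on line 10, a profile entry at offset 3 is matched by an
instruction whose debug location is line 13.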

Added:
    llvm/trunk/test/Transforms/SampleProfile/Inputs/propagate.prof
    llvm/trunk/test/Transforms/SampleProfile/Inputs/syntax.prof
    llvm/trunk/test/Transforms/SampleProfile/propagate.ll
Modified:
    llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
    llvm/trunk/test/Transforms/SampleProfile/branch.ll
    llvm/trunk/test/Transforms/SampleProfile/syntax.ll

Modified: llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp?rev=198972&r1=198971&r2=198972&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp Fri Jan 10 17:23:46 2014
@@ -26,9 +26,14 @@
 
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/OwningPtr.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/DebugInfo/DIContext.h"
+#include "llvm/Analysis/Dominators.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/DebugInfo.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/Instructions.h"
@@ -52,11 +57,19 @@ using namespace llvm;
 static cl::opt<std::string> SampleProfileFile(
     "sample-profile-file", cl::init(""), cl::value_desc("filename"),
     cl::desc("Profile file loaded by -sample-profile"), cl::Hidden);
+static cl::opt<unsigned> SampleProfileMaxPropagateIterations(
+    "sample-profile-max-propagate-iterations", cl::init(100),
+    cl::desc("Maximum number of iterations to go through when propagating "
+             "sample block/edge weights through the CFG."));
 
 namespace {
 
 typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
 typedef DenseMap<BasicBlock *, uint32_t> BlockWeightMap;
+typedef DenseMap<BasicBlock *, BasicBlock *> EquivalenceClassMap;
+typedef std::pair<BasicBlock *, BasicBlock *> Edge;
+typedef DenseMap<Edge, uint32_t> EdgeWeightMap;
+typedef DenseMap<BasicBlock *, SmallVector<BasicBlock *, 8> > BlockEdgeMap;
 
 /// \brief Representation of the runtime profile for a function.
 ///
@@ -65,19 +78,34 @@ typedef DenseMap<BasicBlock *, uint32_t>
 /// in the function and a map of samples collected in every statement.
 class SampleFunctionProfile {
 public:
-  SampleFunctionProfile() : TotalSamples(0), TotalHeadSamples(0) {}
-
-  bool emitAnnotations(Function &F);
-  uint32_t getInstWeight(Instruction &I, unsigned FirstLineno,
-                         BodySampleMap &BodySamples);
-  uint32_t computeBlockWeight(BasicBlock *B, unsigned FirstLineno,
-                              BodySampleMap &BodySamples);
+  SampleFunctionProfile()
+      : TotalSamples(0), TotalHeadSamples(0), HeaderLineno(0), DT(0), PDT(0),
+        LI(0) {}
+
+  unsigned getFunctionLoc(Function &F);
+  bool emitAnnotations(Function &F, DominatorTree *DomTree,
+                       PostDominatorTree *PostDomTree, LoopInfo *Loops);
+  uint32_t getInstWeight(Instruction &I);
+  uint32_t getBlockWeight(BasicBlock *B);
   void addTotalSamples(unsigned Num) { TotalSamples += Num; }
   void addHeadSamples(unsigned Num) { TotalHeadSamples += Num; }
   void addBodySamples(unsigned LineOffset, unsigned Num) {
     BodySamples[LineOffset] += Num;
   }
   void print(raw_ostream &OS);
+  void printEdgeWeight(raw_ostream &OS, Edge E);
+  void printBlockWeight(raw_ostream &OS, BasicBlock *BB);
+  void printBlockEquivalence(raw_ostream &OS, BasicBlock *BB);
+  bool computeBlockWeights(Function &F);
+  void findEquivalenceClasses(Function &F);
+  void findEquivalencesFor(BasicBlock *BB1,
+                           SmallVector<BasicBlock *, 8> Descendants,
+                           DominatorTreeBase<BasicBlock> *DomTree);
+  void propagateWeights(Function &F);
+  uint32_t visitEdge(Edge E, unsigned *NumUnknownEdges, Edge *UnknownEdge);
+  void buildEdges(Function &F);
+  bool propagateThroughEdges(Function &F);
+  bool empty() { return BodySamples.empty(); }
 
 protected:
   /// \brief Total number of samples collected inside this function.
@@ -86,9 +114,15 @@ protected:
   /// inside this function and all its inlined callees.
   unsigned TotalSamples;
 
-  // \brief Total number of samples collected at the head of the function.
+  /// \brief Total number of samples collected at the head of the function.
   unsigned TotalHeadSamples;
 
+  /// \brief Line number for the function header. Used to compute relative
+  /// line numbers from the absolute line LOCs found in instruction locations.
+  /// The relative line numbers are needed to address the samples from the
+  /// profile file.
+  unsigned HeaderLineno;
+
   /// \brief Map line offsets to collected samples.
   ///
   /// Each entry in this map contains the number of samples
@@ -101,6 +135,37 @@ protected:
   /// The weight of a basic block is defined to be the maximum
   /// of all the instruction weights in that block.
   BlockWeightMap BlockWeights;
+
+  /// \brief Map edges to their computed weights.
+  ///
+  /// Edge weights are computed by propagating basic block weights in
+  /// SampleProfile::propagateWeights.
+  EdgeWeightMap EdgeWeights;
+
+  /// \brief Set of visited blocks during propagation.
+  SmallPtrSet<BasicBlock *, 128> VisitedBlocks;
+
+  /// \brief Set of visited edges during propagation.
+  SmallSet<Edge, 128> VisitedEdges;
+
+  /// \brief Equivalence classes for block weights.
+  ///
+  /// Two blocks BB1 and BB2 are in the same equivalence class if BB1
+  /// dominates BB2, BB2 post-dominates BB1, and both are in the same loop
+  /// nest. When this happens, the two blocks are guaranteed to execute
+  /// the same number of times.
+  EquivalenceClassMap EquivalenceClass;
+
+  /// \brief Dominance, post-dominance and loop information.
+  DominatorTree *DT;
+  PostDominatorTree *PDT;
+  LoopInfo *LI;
+
+  /// \brief Predecessors for each basic block in the CFG.
+  BlockEdgeMap Predecessors;
+
+  /// \brief Successors for each basic block in the CFG.
+  BlockEdgeMap Successors;
 };
 
 /// \brief Sample-based profile reader.
@@ -238,6 +303,9 @@ public:
 
   virtual void getAnalysisUsage(AnalysisUsage &AU) const {
     AU.setPreservesCFG();
+    AU.addRequired<LoopInfo>();
+    AU.addRequired<DominatorTree>();
+    AU.addRequired<PostDominatorTree>();
   }
 
 protected:
@@ -263,6 +331,34 @@ void SampleFunctionProfile::print(raw_os
   OS << "\n";
 }
 
+/// \brief Print the weight of edge \p E on stream \p OS.
+///
+/// \param OS  Stream to emit the output to.
+/// \param E  Edge to print.
+void SampleFunctionProfile::printEdgeWeight(raw_ostream &OS, Edge E) {
+  OS << "weight[" << E.first->getName() << "->" << E.second->getName()
+     << "]: " << EdgeWeights[E] << "\n";
+}
+
+/// \brief Print the equivalence class of block \p BB on stream \p OS.
+///
+/// \param OS  Stream to emit the output to.
+/// \param BB  Block to print.
+void SampleFunctionProfile::printBlockEquivalence(raw_ostream &OS,
+                                                  BasicBlock *BB) {
+  BasicBlock *Equiv = EquivalenceClass[BB];
+  OS << "equivalence[" << BB->getName()
+     << "]: " << ((Equiv) ? EquivalenceClass[BB]->getName() : "NONE") << "\n";
+}
+
+/// \brief Print the weight of block \p BB on stream \p OS.
+///
+/// \param OS  Stream to emit the output to.
+/// \param BB  Block to print.
+void SampleFunctionProfile::printBlockWeight(raw_ostream &OS, BasicBlock *BB) {
+  OS << "weight[" << BB->getName() << "]: " << BlockWeights[BB] << "\n";
+}
+
 /// \brief Print the function profile for \p FName on stream \p OS.
 ///
 /// \param OS Stream to emit the output to.
@@ -361,6 +457,12 @@ void SampleModuleProfile::loadText() {
       unsigned LineOffset, NumSamples;
       Matches[1].getAsInteger(10, LineOffset);
       Matches[2].getAsInteger(10, NumSamples);
+      // When dealing with instruction weights, we use the value
+      // zero to indicate the absence of a sample. If we read an
+      // actual zero from the profile file, return it as 1 to
+      // avoid the confusion later on.
+      if (NumSamples == 0)
+        NumSamples = 1;
       FProfile.addBodySamples(LineOffset, NumSamples);
     }
 
@@ -369,59 +471,39 @@ void SampleModuleProfile::loadText() {
   }
 }
 
-char SampleProfileLoader::ID = 0;
-INITIALIZE_PASS(SampleProfileLoader, "sample-profile", "Sample Profile loader",
-                false, false)
-
-bool SampleProfileLoader::doInitialization(Module &M) {
-  Profiler.reset(new SampleModuleProfile(Filename));
-  Profiler->loadText();
-  return true;
-}
-
-FunctionPass *llvm::createSampleProfileLoaderPass() {
-  return new SampleProfileLoader(SampleProfileFile);
-}
-
-FunctionPass *llvm::createSampleProfileLoaderPass(StringRef Name) {
-  return new SampleProfileLoader(Name);
-}
-
 /// \brief Get the weight for an instruction.
 ///
 /// The "weight" of an instruction \p Inst is the number of samples
 /// collected on that instruction at runtime. To retrieve it, we
 /// need to compute the line number of \p Inst relative to the start of its
-/// function. We use \p FirstLineno to compute the offset. We then
-/// look up the samples collected for \p Inst using \p BodySamples.
+/// function. We use HeaderLineno to compute the offset. We then
+/// look up the samples collected for \p Inst using BodySamples.
 ///
 /// \param Inst Instruction to query.
-/// \param FirstLineno Line number of the first instruction in the function.
-/// \param BodySamples Map of relative source line locations to samples.
 ///
 /// \returns The profiled weight of I.
-uint32_t SampleFunctionProfile::getInstWeight(Instruction &Inst,
-                                              unsigned FirstLineno,
-                                              BodySampleMap &BodySamples) {
-  unsigned LOffset = Inst.getDebugLoc().getLine() - FirstLineno + 1;
-  return BodySamples.lookup(LOffset);
+uint32_t SampleFunctionProfile::getInstWeight(Instruction &Inst) {
+  unsigned Lineno = Inst.getDebugLoc().getLine();
+  if (Lineno < HeaderLineno)
+    return 0;
+  unsigned LOffset = Lineno - HeaderLineno;
+  uint32_t Weight = BodySamples.lookup(LOffset);
+  DEBUG(dbgs() << "    " << Lineno << ":" << Inst.getDebugLoc().getCol() << ":"
+               << Inst << " (line offset: " << LOffset
+               << " - weight: " << Weight << ")\n");
+  return Weight;
 }
 
 /// \brief Compute the weight of a basic block.
 ///
 /// The weight of basic block \p B is the maximum weight of all the
-/// instructions in B.
+/// instructions in B. The weight of \p B is computed and cached in
+/// the BlockWeights map.
 ///
 /// \param B The basic block to query.
-/// \param FirstLineno The line number for the first line in the
-///     function holding B.
-/// \param BodySamples The map containing all the samples collected in that
-///     function.
 ///
 /// \returns The computed weight of B.
-uint32_t SampleFunctionProfile::computeBlockWeight(BasicBlock *B,
-                                                   unsigned FirstLineno,
-                                                   BodySampleMap &BodySamples) {
+uint32_t SampleFunctionProfile::getBlockWeight(BasicBlock *B) {
   // If we've computed B's weight before, return it.
   std::pair<BlockWeightMap::iterator, bool> Entry =
       BlockWeights.insert(std::make_pair(B, 0));
@@ -431,7 +513,7 @@ uint32_t SampleFunctionProfile::computeB
   // Otherwise, compute and cache B's weight.
   uint32_t Weight = 0;
   for (BasicBlock::iterator I = B->begin(), E = B->end(); I != E; ++I) {
-    uint32_t InstWeight = getInstWeight(*I, FirstLineno, BodySamples);
+    uint32_t InstWeight = getInstWeight(*I);
     if (InstWeight > Weight)
       Weight = InstWeight;
   }
@@ -439,30 +521,344 @@ uint32_t SampleFunctionProfile::computeB
   return Weight;
 }
 
-/// \brief Generate branch weight metadata for all branches in \p F.
+/// \brief Compute and store the weights of every basic block.
 ///
-/// For every branch instruction B in \p F, we compute the weight of the
-/// target block for each of the edges out of B. This is the weight
-/// that we associate with that branch.
-///
-/// TODO - This weight assignment will most likely be wrong if the
-/// target branch has more than two predecessors. This needs to be done
-/// using some form of flow propagation.
+/// This populates the BlockWeights map by computing
+/// the weights of every basic block in the CFG.
 ///
-/// Once all the branch weights are computed, we emit the MD_prof
-/// metadata on B using the computed values.
+/// \param F The function to query.
+bool SampleFunctionProfile::computeBlockWeights(Function &F) {
+  bool Changed = false;
+  DEBUG(dbgs() << "Block weights\n");
+  for (Function::iterator B = F.begin(), E = F.end(); B != E; ++B) {
+    uint32_t Weight = getBlockWeight(B);
+    Changed |= (Weight > 0);
+    DEBUG(printBlockWeight(dbgs(), B));
+  }
+
+  return Changed;
+}
+
+/// \brief Find equivalence classes for the given block.
+///
+/// This finds all the blocks that are guaranteed to execute the same
+/// number of times as \p BB1. To do this, it traverses all the
+/// descendants of \p BB1 in the dominator or post-dominator tree.
+///
+/// A block BB2 will be in the same equivalence class as \p BB1 if
+/// the following holds:
+///
+/// 1- \p BB1 is a descendant of BB2 in the opposite tree. So, if BB2
+///    is a descendant of \p BB1 in the dominator tree, then BB2 should
+///    dominate BB1 in the post-dominator tree.
+///
+/// 2- Both BB2 and \p BB1 must be in the same loop.
+///
+/// For every block BB2 that meets those two requirements, we set BB2's
+/// equivalence class to \p BB1.
+///
+/// \param BB1  Block to check.
+/// \param Descendants  Descendants of \p BB1 in either the dom or pdom tree.
+/// \param DomTree  Opposite dominator tree. If \p Descendants is filled
+///                 with blocks from \p BB1's dominator tree, then
+///                 this is the post-dominator tree, and vice versa.
+void SampleFunctionProfile::findEquivalencesFor(
+    BasicBlock *BB1, SmallVector<BasicBlock *, 8> Descendants,
+    DominatorTreeBase<BasicBlock> *DomTree) {
+  for (SmallVectorImpl<BasicBlock *>::iterator I = Descendants.begin(),
+                                               E = Descendants.end();
+       I != E; ++I) {
+    BasicBlock *BB2 = *I;
+    bool IsDomParent = DomTree->dominates(BB2, BB1);
+    bool IsInSameLoop = LI->getLoopFor(BB1) == LI->getLoopFor(BB2);
+    if (BB1 != BB2 && VisitedBlocks.insert(BB2) && IsDomParent &&
+        IsInSameLoop) {
+      EquivalenceClass[BB2] = BB1;
+
+      // If BB2 is heavier than BB1, make BB1 have the same weight
+      // as BB2.
+      //
+      // Note that we don't worry about the opposite situation here
+      // (when BB2 is lighter than BB1). We will deal with this
+      // during the propagation phase. Right now, we just want to
+      // make sure that BB1 has the largest weight of all the
+      // members of its equivalence set.
+      uint32_t &BB1Weight = BlockWeights[BB1];
+      uint32_t &BB2Weight = BlockWeights[BB2];
+      BB1Weight = std::max(BB1Weight, BB2Weight);
+    }
+  }
+}
+
+/// \brief Find equivalence classes.
+///
+/// Since samples may be missing from blocks, we can fill in the gaps by setting
+/// the weights of all the blocks in the same equivalence class to the same
+/// weight. To compute the concept of equivalence, we use dominance and loop
+/// information. Two blocks B1 and B2 are in the same equivalence class if B1
+/// dominates B2, B2 post-dominates B1 and both are in the same loop.
 ///
 /// \param F The function to query.
-bool SampleFunctionProfile::emitAnnotations(Function &F) {
+void SampleFunctionProfile::findEquivalenceClasses(Function &F) {
+  SmallVector<BasicBlock *, 8> DominatedBBs;
+  DEBUG(dbgs() << "\nBlock equivalence classes\n");
+  // Find equivalence sets based on dominance and post-dominance information.
+  for (Function::iterator B = F.begin(), E = F.end(); B != E; ++B) {
+    BasicBlock *BB1 = B;
+
+    // Compute BB1's equivalence class once.
+    if (EquivalenceClass.count(BB1)) {
+      DEBUG(printBlockEquivalence(dbgs(), BB1));
+      continue;
+    }
+
+    // By default, blocks are in their own equivalence class.
+    EquivalenceClass[BB1] = BB1;
+
+    // Traverse all the blocks dominated by BB1. We are looking for
+    // every basic block BB2 such that:
+    //
+    // 1- BB1 dominates BB2.
+    // 2- BB2 post-dominates BB1.
+    // 3- BB1 and BB2 are in the same loop nest.
+    //
+    // If all those conditions hold, it means that BB2 is executed
+    // as many times as BB1, so they are placed in the same equivalence
+    // class by making BB2's equivalence class be BB1.
+    DominatedBBs.clear();
+    DT->getDescendants(BB1, DominatedBBs);
+    findEquivalencesFor(BB1, DominatedBBs, PDT->DT);
+
+    // Repeat the same logic for all the blocks post-dominated by BB1.
+    // We are looking for every basic block BB2 such that:
+    //
+    // 1- BB1 post-dominates BB2.
+    // 2- BB2 dominates BB1.
+    // 3- BB1 and BB2 are in the same loop nest.
+    //
+    // If all those conditions hold, BB2's equivalence class is BB1.
+    DominatedBBs.clear();
+    PDT->getDescendants(BB1, DominatedBBs);
+    findEquivalencesFor(BB1, DominatedBBs, DT->DT);
+
+    DEBUG(printBlockEquivalence(dbgs(), BB1));
+  }
+
+  // Assign weights to equivalence classes.
+  //
+  // All the basic blocks in the same equivalence class will execute
+  // the same number of times. Since we know that the head block in
+  // each equivalence class has the largest weight, assign that weight
+  // to all the blocks in that equivalence class.
+  DEBUG(dbgs() << "\nAssign the same weight to all blocks in the same class\n");
+  for (Function::iterator B = F.begin(), E = F.end(); B != E; ++B) {
+    BasicBlock *BB = B;
+    BasicBlock *EquivBB = EquivalenceClass[BB];
+    if (BB != EquivBB)
+      BlockWeights[BB] = BlockWeights[EquivBB];
+    DEBUG(printBlockWeight(dbgs(), BB));
+  }
+}
+
+/// \brief Visit the given edge to decide if it has a valid weight.
+///
+/// If \p E has not been visited before, we copy it to \p UnknownEdge
+/// and increment the count of unknown edges.
+///
+/// \param E  Edge to visit.
+/// \param NumUnknownEdges  Current number of unknown edges.
+/// \param UnknownEdge  Set if E has not been visited before.
+///
+/// \returns E's weight, if known. Otherwise, return 0.
+uint32_t SampleFunctionProfile::visitEdge(Edge E, unsigned *NumUnknownEdges,
+                                          Edge *UnknownEdge) {
+  if (!VisitedEdges.count(E)) {
+    (*NumUnknownEdges)++;
+    *UnknownEdge = E;
+    return 0;
+  }
+
+  return EdgeWeights[E];
+}
+
+/// \brief Propagate weights through incoming/outgoing edges.
+///
+/// If the weight of a basic block is known, and there is only one edge
+/// with an unknown weight, we can calculate the weight of that edge.
+///
+/// Similarly, if all the edges have a known count, we can calculate the
+/// count of the basic block, if needed.
+///
+/// \param F  Function to process.
+///
+/// \returns  True if new weights were assigned to edges or blocks.
+bool SampleFunctionProfile::propagateThroughEdges(Function &F) {
   bool Changed = false;
-  unsigned FirstLineno = inst_begin(F)->getDebugLoc().getLine();
-  MDBuilder MDB(F.getContext());
+  DEBUG(dbgs() << "\nPropagation through edges\n");
+  for (Function::iterator BI = F.begin(), EI = F.end(); BI != EI; ++BI) {
+    BasicBlock *BB = BI;
+
+    // Visit all the predecessor and successor edges to determine
+    // which ones have a weight assigned already. Note that it doesn't
+    // matter that we only keep track of a single unknown edge. The
+    // only case we are interested in handling is when only a single
+    // edge is unknown (see setEdgeOrBlockWeight).
+    for (unsigned i = 0; i < 2; i++) {
+      uint32_t TotalWeight = 0;
+      unsigned NumUnknownEdges = 0;
+      Edge UnknownEdge, SelfReferentialEdge;
+
+      if (i == 0) {
+        // First, visit all predecessor edges.
+        for (size_t I = 0; I < Predecessors[BB].size(); I++) {
+          Edge E = std::make_pair(Predecessors[BB][I], BB);
+          TotalWeight += visitEdge(E, &NumUnknownEdges, &UnknownEdge);
+          if (E.first == E.second)
+            SelfReferentialEdge = E;
+        }
+      } else {
+        // On the second round, visit all successor edges.
+        for (size_t I = 0; I < Successors[BB].size(); I++) {
+          Edge E = std::make_pair(BB, Successors[BB][I]);
+          TotalWeight += visitEdge(E, &NumUnknownEdges, &UnknownEdge);
+        }
+      }
+
+      // After visiting all the edges, there are three cases that we
+      // can handle immediately:
+      //
+      // - All the edge weights are known (i.e., NumUnknownEdges == 0).
+      //   In this case, we simply check that the sum of all the edges
+      //   is the same as BB's weight. If not, we change BB's weight
+      //   to match. Additionally, if BB had not been visited before,
+      //   we mark it visited.
+      //
+      // - Only one edge is unknown and BB has already been visited.
+      //   In this case, we can compute the weight of the edge by
+      //   subtracting the sum of all the known edge weights from
+      //   the block weight. If the edges weigh more than BB, then the
+      //   weight of the last remaining edge is set to zero.
+      //
+      // - There exists a self-referential edge and the weight of BB is
+      //   known. In this case, this edge can be based on BB's weight.
+      //   We add up all the other known edges and set the weight on
+      //   the self-referential edge as we did in the previous case.
+      //
+      // In any other case, we must continue iterating. Eventually,
+      // all edges will get a weight, or iteration will stop when
+      // it reaches SampleProfileMaxPropagateIterations.
+      if (NumUnknownEdges <= 1) {
+        uint32_t &BBWeight = BlockWeights[BB];
+        if (NumUnknownEdges == 0) {
+          // If we already know the weight of all edges, the weight of the
+          // basic block can be computed. It should be no smaller than the sum
+          // of all edge weights.
+          if (TotalWeight > BBWeight) {
+            BBWeight = TotalWeight;
+            Changed = true;
+            DEBUG(dbgs() << "All edge weights for " << BB->getName()
+                         << " known. Set weight for block: ";
+                  printBlockWeight(dbgs(), BB););
+          }
+          if (VisitedBlocks.insert(BB))
+            Changed = true;
+        } else if (NumUnknownEdges == 1 && VisitedBlocks.count(BB)) {
+          // If there is a single unknown edge and the block has been
+          // visited, then we can compute E's weight.
+          if (BBWeight >= TotalWeight)
+            EdgeWeights[UnknownEdge] = BBWeight - TotalWeight;
+          else
+            EdgeWeights[UnknownEdge] = 0;
+          VisitedEdges.insert(UnknownEdge);
+          Changed = true;
+          DEBUG(dbgs() << "Set weight for edge: ";
+                printEdgeWeight(dbgs(), UnknownEdge));
+        }
+      } else if (SelfReferentialEdge.first && VisitedBlocks.count(BB)) {
+        uint32_t &BBWeight = BlockWeights[BB];
+        // We have a self-referential edge and the weight of BB is known.
+        if (BBWeight >= TotalWeight)
+          EdgeWeights[SelfReferentialEdge] = BBWeight - TotalWeight;
+        else
+          EdgeWeights[SelfReferentialEdge] = 0;
+        VisitedEdges.insert(SelfReferentialEdge);
+        Changed = true;
+        DEBUG(dbgs() << "Set self-referential edge weight to: ";
+              printEdgeWeight(dbgs(), SelfReferentialEdge));
+      }
+    }
+  }
+
+  return Changed;
+}
+
+/// \brief Build in/out edge lists for each basic block in the CFG.
+///
+/// We are interested in unique edges. If a block B1 has multiple
+/// edges to another block B2, we only add a single B1->B2 edge.
+void SampleFunctionProfile::buildEdges(Function &F) {
+  for (Function::iterator I = F.begin(), E = F.end(); I != E; ++I) {
+    BasicBlock *B1 = I;
 
-  // Clear the block weights cache.
-  BlockWeights.clear();
+    // Add predecessors for B1.
+    SmallPtrSet<BasicBlock *, 16> Visited;
+    if (!Predecessors[B1].empty())
+      llvm_unreachable("Found a stale predecessors list in a basic block.");
+    for (pred_iterator PI = pred_begin(B1), PE = pred_end(B1); PI != PE; ++PI) {
+      BasicBlock *B2 = *PI;
+      if (Visited.insert(B2))
+        Predecessors[B1].push_back(B2);
+    }
+
+    // Add successors for B1.
+    Visited.clear();
+    if (!Successors[B1].empty())
+      llvm_unreachable("Found a stale successors list in a basic block.");
+    for (succ_iterator SI = succ_begin(B1), SE = succ_end(B1); SI != SE; ++SI) {
+      BasicBlock *B2 = *SI;
+      if (Visited.insert(B2))
+        Successors[B1].push_back(B2);
+    }
+  }
+}
+
+/// \brief Propagate weights into edges
+///
+/// The following rules are applied to every block B in the CFG:
+///
+/// - If B has a single predecessor/successor, then the weight
+///   of that edge is the weight of the block.
+///
+/// - If all incoming or outgoing edges are known except one, and the
+///   weight of the block is already known, the weight of the unknown
+///   edge will be the weight of the block minus the sum of all the known
+///   edges. If the sum of all the known edges is larger than B's weight,
+///   we set the unknown edge weight to zero.
+///
+/// - If there is a self-referential edge, and the weight of the block is
+///   known, the weight for that edge is set to the weight of the block
+///   minus the weight of the other incoming edges to that block (if
+///   known).
+void SampleFunctionProfile::propagateWeights(Function &F) {
+  bool Changed = true;
+  unsigned i = 0;
+
+  // Before propagation starts, build, for each block, a list of
+  // unique predecessors and successors. This is necessary to handle
+  // identical edges in multiway branches. Since we visit all blocks and all
+  // edges of the CFG, it is cleaner to build these lists once at the start
+  // of the pass.
+  buildEdges(F);
+
+  // Propagate until we converge or we go past the iteration limit.
+  while (Changed && i++ < SampleProfileMaxPropagateIterations) {
+    Changed = propagateThroughEdges(F);
+  }
 
-  // When we find a branch instruction: For each edge E out of the branch,
-  // the weight of E is the weight of the target block.
+  // Generate MD_prof metadata for every branch instruction using the
+  // edge weights computed during propagation.
+  DEBUG(dbgs() << "\nPropagation complete. Setting branch weights\n");
+  MDBuilder MDB(F.getContext());
   for (Function::iterator I = F.begin(), E = F.end(); I != E; ++I) {
     BasicBlock *B = I;
     TerminatorInst *TI = B->getTerminator();
@@ -471,22 +867,164 @@ bool SampleFunctionProfile::emitAnnotati
     if (!isa<BranchInst>(TI) && !isa<SwitchInst>(TI))
       continue;
 
+    DEBUG(dbgs() << "\nGetting weights for branch at line "
+                 << TI->getDebugLoc().getLine() << ":"
+                 << TI->getDebugLoc().getCol() << ".\n");
     SmallVector<uint32_t, 4> Weights;
-    unsigned NSuccs = TI->getNumSuccessors();
-    for (unsigned I = 0; I < NSuccs; ++I) {
+    bool AllWeightsZero = true;
+    for (unsigned I = 0; I < TI->getNumSuccessors(); ++I) {
       BasicBlock *Succ = TI->getSuccessor(I);
-      uint32_t Weight = computeBlockWeight(Succ, FirstLineno, BodySamples);
+      Edge E = std::make_pair(B, Succ);
+      uint32_t Weight = EdgeWeights[E];
+      DEBUG(dbgs() << "\t"; printEdgeWeight(dbgs(), E));
       Weights.push_back(Weight);
+      if (Weight != 0)
+        AllWeightsZero = false;
     }
 
-    TI->setMetadata(llvm::LLVMContext::MD_prof,
-                    MDB.createBranchWeights(Weights));
-    Changed = true;
+    // Only set weights if there is at least one non-zero weight.
+    // In any other case, let the analyzer set weights.
+    if (!AllWeightsZero) {
+      DEBUG(dbgs() << "SUCCESS. Found non-zero weights.\n");
+      TI->setMetadata(llvm::LLVMContext::MD_prof,
+                      MDB.createBranchWeights(Weights));
+    } else {
+      DEBUG(dbgs() << "SKIPPED. All branch weights are zero.\n");
+    }
+  }
+}
+
+/// \brief Get the line number for the function header.
+///
+/// This looks up function \p F in the current compilation unit and
+/// retrieves the line number where the function is defined. This is
+/// line 0 for all the samples read from the profile file. Every line
+/// number is relative to this line.
+///
+/// \param F  Function object to query.
+///
+/// \returns the line number where \p F is defined.
+unsigned SampleFunctionProfile::getFunctionLoc(Function &F) {
+  NamedMDNode *CUNodes = F.getParent()->getNamedMetadata("llvm.dbg.cu");
+  if (CUNodes) {
+    for (unsigned I = 0, E1 = CUNodes->getNumOperands(); I != E1; ++I) {
+      DICompileUnit CU(CUNodes->getOperand(I));
+      DIArray Subprograms = CU.getSubprograms();
+      for (unsigned J = 0, E2 = Subprograms.getNumElements(); J != E2; ++J) {
+        DISubprogram Subprogram(Subprograms.getElement(J));
+        if (Subprogram.describes(&F))
+          return Subprogram.getLineNumber();
+      }
+    }
+  }
+
+  report_fatal_error("No debug information found in function " + F.getName() +
+                     "\n");
+}
+
+/// \brief Generate branch weight metadata for all branches in \p F.
+///
+/// Branch weights are computed out of instruction samples using a
+/// propagation heuristic. Propagation proceeds in 3 phases:
+///
+/// 1- Assignment of block weights. All the basic blocks in the function
+///    are initially assigned the same weight as their most frequently
+///    executed instruction.
+///
+/// 2- Creation of equivalence classes. Since samples may be missing from
+///    blocks, we can fill in the gaps by setting the weights of all the
+///    blocks in the same equivalence class to the same weight. To compute
+///    the concept of equivalence, we use dominance and loop information.
+///    Two blocks B1 and B2 are in the same equivalence class if B1
+///    dominates B2, B2 post-dominates B1 and both are in the same loop.
+///
+/// 3- Propagation of block weights into edges. This uses a simple
+///    propagation heuristic. The following rules are applied to every
+///    block B in the CFG:
+///
+///    - If B has a single predecessor/successor, then the weight
+///      of that edge is the weight of the block.
+///
+///    - If all the edges are known except one, and the weight of the
+///      block is already known, the weight of the unknown edge will
+///      be the weight of the block minus the sum of all the known
+///      edges. If the sum of all the known edges is larger than B's weight,
+///      we set the unknown edge weight to zero.
+///
+///    - If there is a self-referential edge, and the weight of the block is
+///      known, the weight for that edge is set to the weight of the block
+///      minus the weight of the other incoming edges to that block (if
+///      known).
+///
+/// Since this propagation is not guaranteed to finalize for every CFG, we
+/// only allow it to proceed for a limited number of iterations (controlled
+/// by -sample-profile-max-propagate-iterations).
+///
+/// FIXME: Try to replace this propagation heuristic with a scheme
+/// that is guaranteed to finalize. A work-list approach similar to
+/// the standard value propagation algorithm used by SSA-CCP might
+/// work here.
+///
+/// Once all the branch weights are computed, we emit the MD_prof
+/// metadata on B using the computed values for each of its branches.
+///
+/// \param F The function to query.
+bool SampleFunctionProfile::emitAnnotations(Function &F, DominatorTree *DomTree,
+                                            PostDominatorTree *PostDomTree,
+                                            LoopInfo *Loops) {
+  bool Changed = false;
+
+  // Initialize invariants used during computation and propagation.
+  HeaderLineno = getFunctionLoc(F);
+  DEBUG(dbgs() << "Line number for the first instruction in " << F.getName()
+               << ": " << HeaderLineno << "\n");
+  DT = DomTree;
+  PDT = PostDomTree;
+  LI = Loops;
+
+  // Compute basic block weights.
+  Changed |= computeBlockWeights(F);
+
+  if (Changed) {
+    // Find equivalence classes.
+    findEquivalenceClasses(F);
+
+    // Propagate weights to all edges.
+    propagateWeights(F);
   }
 
   return Changed;
 }
 
+char SampleProfileLoader::ID = 0;
+INITIALIZE_PASS_BEGIN(SampleProfileLoader, "sample-profile",
+                      "Sample Profile loader", false, false)
+INITIALIZE_PASS_DEPENDENCY(DominatorTree)
+INITIALIZE_PASS_DEPENDENCY(PostDominatorTree)
+INITIALIZE_PASS_DEPENDENCY(LoopInfo)
+INITIALIZE_PASS_END(SampleProfileLoader, "sample-profile",
+                    "Sample Profile loader", false, false)
+
+bool SampleProfileLoader::doInitialization(Module &M) {
+  Profiler.reset(new SampleModuleProfile(Filename));
+  Profiler->loadText();
+  return true;
+}
+
+FunctionPass *llvm::createSampleProfileLoaderPass() {
+  return new SampleProfileLoader(SampleProfileFile);
+}
+
+FunctionPass *llvm::createSampleProfileLoaderPass(StringRef Name) {
+  return new SampleProfileLoader(Name);
+}
+
 bool SampleProfileLoader::runOnFunction(Function &F) {
-  return Profiler->getProfile(F).emitAnnotations(F);
+  DominatorTree *DT = &getAnalysis<DominatorTree>();
+  PostDominatorTree *PDT = &getAnalysis<PostDominatorTree>();
+  LoopInfo *LI = &getAnalysis<LoopInfo>();
+  SampleFunctionProfile &FunctionProfile = Profiler->getProfile(F);
+  if (!FunctionProfile.empty())
+    return FunctionProfile.emitAnnotations(F, DT, PDT, LI);
+  return false;
 }
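
Note on phase 3 of the propagation described in the comment for
emitAnnotations above: the "all edges known except one" rule can be sketched
in isolation as follows. This is only an illustration under stated
assumptions, not the code from this patch. The helper name
inferUnknownEdgeWeight and its signature are hypothetical, and the sketch
uses plain standard-library types instead of the pass's own Edge and
EdgeWeights structures.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of the patch: given the weight of block B
// and the weights of all of its already-known edges, return the weight to
// assign to the single remaining unknown edge.
uint64_t inferUnknownEdgeWeight(uint64_t BlockWeight,
                                const std::vector<uint64_t> &KnownEdgeWeights) {
  // Sum the weights of the edges that have already been resolved.
  uint64_t KnownSum = std::accumulate(KnownEdgeWeights.begin(),
                                      KnownEdgeWeights.end(), uint64_t(0));
  // If the known edges already exceed the block weight, clamp to zero
  // instead of underflowing, as the rule in the comment describes.
  return KnownSum > BlockWeight ? 0 : BlockWeight - KnownSum;
}
```

For example, a block of weight 10 with known edges of weight 7 and 5 would
give the remaining edge a weight of 0 rather than a wrapped-around value.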

Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/propagate.prof
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/propagate.prof?rev=198972&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/Inputs/propagate.prof (added)
+++ llvm/trunk/test/Transforms/SampleProfile/Inputs/propagate.prof Fri Jan 10 17:23:46 2014
@@ -0,0 +1,20 @@
+symbol table
+1
+_Z3fooiil
+_Z3fooiil:58139:0:16
+0: 0
+1: 0
+2: 0
+4: 1
+5: 10
+6: 0
+7: 5
+8: 3
+9: 0
+10: 0
+11: 6339
+12: 16191
+13: 8141
+16: 1
+18: 0
+19: 0

Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/syntax.prof
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/syntax.prof?rev=198972&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/Inputs/syntax.prof (added)
+++ llvm/trunk/test/Transforms/SampleProfile/Inputs/syntax.prof Fri Jan 10 17:23:46 2014
@@ -0,0 +1,6 @@
+symbol table
+1
+empty
+empty:100:0:2
+0: 0
+1: 100

Modified: llvm/trunk/test/Transforms/SampleProfile/branch.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/branch.ll?rev=198972&r1=198971&r2=198972&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/branch.ll (original)
+++ llvm/trunk/test/Transforms/SampleProfile/branch.ll Fri Jan 10 17:23:46 2014
@@ -46,8 +46,8 @@ if.end:
   tail call void @llvm.dbg.value(metadata !{i32 %call}, i64 0, metadata !17), !dbg !30
   %cmp1 = icmp sgt i32 %call, 100, !dbg !35
   br i1 %cmp1, label %for.body, label %if.end6, !dbg !35
-; CHECK: edge if.end -> for.body probability is 2243 / 2244 = 99.9554% [HOT edge]
-; CHECK: edge if.end -> if.end6 probability is 1 / 2244 = 0.0445633%
+; CHECK: edge if.end -> for.body probability is 1 / 2 = 50%
+; CHECK: edge if.end -> if.end6 probability is 1 / 2 = 50%
 
 for.body:                                         ; preds = %if.end, %for.body
   %u.016 = phi i32 [ %inc, %for.body ], [ 0, %if.end ]
@@ -65,8 +65,8 @@ for.body:
   tail call void @llvm.dbg.value(metadata !{i32 %inc}, i64 0, metadata !21), !dbg !38
   %exitcond = icmp eq i32 %inc, %call, !dbg !38
   br i1 %exitcond, label %if.end6, label %for.body, !dbg !38
-; CHECK: edge for.body -> if.end6 probability is 1 / 2244 = 0.0445633%
-; CHECK: edge for.body -> for.body probability is 2243 / 2244 = 99.9554% [HOT edge]
+; CHECK: edge for.body -> if.end6 probability is 1 / 10227 = 0.00977804
+; CHECK: edge for.body -> for.body probability is 10226 / 10227 = 99.9902% [HOT edge]
 
 if.end6:                                          ; preds = %for.body, %if.end
   %result.0 = phi double [ 0.000000e+00, %if.end ], [ %sub, %for.body ]

Added: llvm/trunk/test/Transforms/SampleProfile/propagate.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/propagate.ll?rev=198972&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/propagate.ll (added)
+++ llvm/trunk/test/Transforms/SampleProfile/propagate.ll Fri Jan 10 17:23:46 2014
@@ -0,0 +1,243 @@
+; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/propagate.prof | opt -analyze -branch-prob | FileCheck %s
+
+; Original C++ code for this test case:
+;
+; #include <stdio.h>
+;
+; long foo(int x, int y, long N) {
+;   if (x < y) {
+;     return y - x;
+;   } else {
+;     for (long i = 0; i < N; i++) {
+;       if (i > N / 3)
+;         x--;
+;       if (i > N / 4) {
+;         y++;
+;         x += 3;
+;       } else {
+;         for (unsigned j = 0; j < i; j++) {
+;           x += j;
+;           y -= 3;
+;         }
+;       }
+;     }
+;   }
+;   return y * x;
+; }
+;
+; int main() {
+;   int x = 5678;
+;   int y = 1234;
+;   long N = 999999;
+;   printf("foo(%d, %d, %ld) = %ld\n", x, y, N, foo(x, y, N));
+;   return 0;
+; }
+
+; ModuleID = 'propagate.cc'
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@.str = private unnamed_addr constant [24 x i8] c"foo(%d, %d, %ld) = %ld\0A\00", align 1
+
+; Function Attrs: nounwind uwtable
+define i64 @_Z3fooiil(i32 %x, i32 %y, i64 %N) #0 {
+entry:
+  %retval = alloca i64, align 8
+  %x.addr = alloca i32, align 4
+  %y.addr = alloca i32, align 4
+  %N.addr = alloca i64, align 8
+  %i = alloca i64, align 8
+  %j = alloca i32, align 4
+  store i32 %x, i32* %x.addr, align 4
+  store i32 %y, i32* %y.addr, align 4
+  store i64 %N, i64* %N.addr, align 8
+  %0 = load i32* %x.addr, align 4, !dbg !11
+  %1 = load i32* %y.addr, align 4, !dbg !11
+  %cmp = icmp slt i32 %0, %1, !dbg !11
+  br i1 %cmp, label %if.then, label %if.else, !dbg !11
+
+if.then:                                          ; preds = %entry
+  %2 = load i32* %y.addr, align 4, !dbg !13
+  %3 = load i32* %x.addr, align 4, !dbg !13
+  %sub = sub nsw i32 %2, %3, !dbg !13
+  %conv = sext i32 %sub to i64, !dbg !13
+  store i64 %conv, i64* %retval, !dbg !13
+  br label %return, !dbg !13
+
+if.else:                                          ; preds = %entry
+  store i64 0, i64* %i, align 8, !dbg !15
+  br label %for.cond, !dbg !15
+
+for.cond:                                         ; preds = %for.inc16, %if.else
+  %4 = load i64* %i, align 8, !dbg !15
+  %5 = load i64* %N.addr, align 8, !dbg !15
+  %cmp1 = icmp slt i64 %4, %5, !dbg !15
+  br i1 %cmp1, label %for.body, label %for.end18, !dbg !15
+; CHECK: edge for.cond -> for.body probability is 10 / 11 = 90.9091% [HOT edge]
+; CHECK: edge for.cond -> for.end18 probability is 1 / 11 = 9.09091%
+
+for.body:                                         ; preds = %for.cond
+  %6 = load i64* %i, align 8, !dbg !18
+  %7 = load i64* %N.addr, align 8, !dbg !18
+  %div = sdiv i64 %7, 3, !dbg !18
+  %cmp2 = icmp sgt i64 %6, %div, !dbg !18
+  br i1 %cmp2, label %if.then3, label %if.end, !dbg !18
+; CHECK: edge for.body -> if.then3 probability is 1 / 5 = 20%
+; CHECK: edge for.body -> if.end probability is 4 / 5 = 80%
+
+if.then3:                                         ; preds = %for.body
+  %8 = load i32* %x.addr, align 4, !dbg !21
+  %dec = add nsw i32 %8, -1, !dbg !21
+  store i32 %dec, i32* %x.addr, align 4, !dbg !21
+  br label %if.end, !dbg !21
+
+if.end:                                           ; preds = %if.then3, %for.body
+  %9 = load i64* %i, align 8, !dbg !22
+  %10 = load i64* %N.addr, align 8, !dbg !22
+  %div4 = sdiv i64 %10, 4, !dbg !22
+  %cmp5 = icmp sgt i64 %9, %div4, !dbg !22
+  br i1 %cmp5, label %if.then6, label %if.else7, !dbg !22
+; CHECK: edge if.end -> if.then6 probability is 3 / 6342 = 0.0473037%
+; CHECK: edge if.end -> if.else7 probability is 6339 / 6342 = 99.9527% [HOT edge]
+
+if.then6:                                         ; preds = %if.end
+  %11 = load i32* %y.addr, align 4, !dbg !24
+  %inc = add nsw i32 %11, 1, !dbg !24
+  store i32 %inc, i32* %y.addr, align 4, !dbg !24
+  %12 = load i32* %x.addr, align 4, !dbg !26
+  %add = add nsw i32 %12, 3, !dbg !26
+  store i32 %add, i32* %x.addr, align 4, !dbg !26
+  br label %if.end15, !dbg !27
+
+if.else7:                                         ; preds = %if.end
+  store i32 0, i32* %j, align 4, !dbg !28
+  br label %for.cond8, !dbg !28
+
+for.cond8:                                        ; preds = %for.inc, %if.else7
+  %13 = load i32* %j, align 4, !dbg !28
+  %conv9 = zext i32 %13 to i64, !dbg !28
+  %14 = load i64* %i, align 8, !dbg !28
+  %cmp10 = icmp slt i64 %conv9, %14, !dbg !28
+  br i1 %cmp10, label %for.body11, label %for.end, !dbg !28
+; CHECK: edge for.cond8 -> for.body11 probability is 16191 / 16192 = 99.9938% [HOT edge]
+; CHECK: edge for.cond8 -> for.end probability is 1 / 16192 = 0.00617589%
+
+for.body11:                                       ; preds = %for.cond8
+  %15 = load i32* %j, align 4, !dbg !31
+  %16 = load i32* %x.addr, align 4, !dbg !31
+  %add12 = add i32 %16, %15, !dbg !31
+  store i32 %add12, i32* %x.addr, align 4, !dbg !31
+  %17 = load i32* %y.addr, align 4, !dbg !33
+  %sub13 = sub nsw i32 %17, 3, !dbg !33
+  store i32 %sub13, i32* %y.addr, align 4, !dbg !33
+  br label %for.inc, !dbg !34
+
+for.inc:                                          ; preds = %for.body11
+  %18 = load i32* %j, align 4, !dbg !28
+  %inc14 = add i32 %18, 1, !dbg !28
+  store i32 %inc14, i32* %j, align 4, !dbg !28
+  br label %for.cond8, !dbg !28
+
+for.end:                                          ; preds = %for.cond8
+  br label %if.end15
+
+if.end15:                                         ; preds = %for.end, %if.then6
+  br label %for.inc16, !dbg !35
+
+for.inc16:                                        ; preds = %if.end15
+  %19 = load i64* %i, align 8, !dbg !15
+  %inc17 = add nsw i64 %19, 1, !dbg !15
+  store i64 %inc17, i64* %i, align 8, !dbg !15
+  br label %for.cond, !dbg !15
+
+for.end18:                                        ; preds = %for.cond
+  br label %if.end19
+
+if.end19:                                         ; preds = %for.end18
+  %20 = load i32* %y.addr, align 4, !dbg !36
+  %21 = load i32* %x.addr, align 4, !dbg !36
+  %mul = mul nsw i32 %20, %21, !dbg !36
+  %conv20 = sext i32 %mul to i64, !dbg !36
+  store i64 %conv20, i64* %retval, !dbg !36
+  br label %return, !dbg !36
+
+return:                                           ; preds = %if.end19, %if.then
+  %22 = load i64* %retval, !dbg !37
+  ret i64 %22, !dbg !37
+}
+
+; Function Attrs: uwtable
+define i32 @main() #1 {
+entry:
+  %retval = alloca i32, align 4
+  %x = alloca i32, align 4
+  %y = alloca i32, align 4
+  %N = alloca i64, align 8
+  store i32 0, i32* %retval
+  store i32 5678, i32* %x, align 4, !dbg !38
+  store i32 1234, i32* %y, align 4, !dbg !39
+  store i64 999999, i64* %N, align 8, !dbg !40
+  %0 = load i32* %x, align 4, !dbg !41
+  %1 = load i32* %y, align 4, !dbg !41
+  %2 = load i64* %N, align 8, !dbg !41
+  %3 = load i32* %x, align 4, !dbg !41
+  %4 = load i32* %y, align 4, !dbg !41
+  %5 = load i64* %N, align 8, !dbg !41
+  %call = call i64 @_Z3fooiil(i32 %3, i32 %4, i64 %5), !dbg !41
+  %call1 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([24 x i8]* @.str, i32 0, i32 0), i32 %0, i32 %1, i64 %2, i64 %call), !dbg !41
+  ret i32 0, !dbg !42
+}
+
+declare i32 @printf(i8*, ...) #2
+
+attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+attributes #1 = { uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+attributes #2 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!8, !9}
+!llvm.ident = !{!10}
+
+!0 = metadata !{i32 786449, metadata !1, i32 4, metadata !"clang version 3.5 ", i1 false, metadata !"", i32 0, metadata !2, metadata !2, metadata !3, metadata !2, metadata !2, metadata !""} ; [ DW_TAG_compile_unit ] [propagate.cc] [DW_LANG_C_plus_plus]
+!1 = metadata !{metadata !"propagate.cc", metadata !"."}
+!2 = metadata !{i32 0}
+!3 = metadata !{metadata !4, metadata !7}
+!4 = metadata !{i32 786478, metadata !1, metadata !5, metadata !"foo", metadata !"foo", metadata !"", i32 3, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 false, i64 (i32, i32, i64)* @_Z3fooiil, null, null, metadata !2, i32 3} ; [ DW_TAG_subprogram ] [line 3] [def] [foo]
+!5 = metadata !{i32 786473, metadata !1}          ; [ DW_TAG_file_type ] [propagate.cc]
+!6 = metadata !{i32 786453, i32 0, null, metadata !"", i32 0, i64 0, i64 0, i64 0, i32 0, null, metadata !2, i32 0, null, null, null} ; [ DW_TAG_subroutine_type ] [line 0, size 0, align 0, offset 0] [from ]
+!7 = metadata !{i32 786478, metadata !1, metadata !5, metadata !"main", metadata !"main", metadata !"", i32 24, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 false, i32 ()* @main, null, null, metadata !2, i32 24} ; [ DW_TAG_subprogram ] [line 24] [def] [main]
+!8 = metadata !{i32 2, metadata !"Dwarf Version", i32 4}
+!9 = metadata !{i32 1, metadata !"Debug Info Version", i32 1}
+!10 = metadata !{metadata !"clang version 3.5 "}
+!11 = metadata !{i32 4, i32 0, metadata !12, null}
+!12 = metadata !{i32 786443, metadata !1, metadata !4, i32 4, i32 0, i32 0} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!13 = metadata !{i32 5, i32 0, metadata !14, null}
+!14 = metadata !{i32 786443, metadata !1, metadata !12, i32 4, i32 0, i32 1} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!15 = metadata !{i32 7, i32 0, metadata !16, null}
+!16 = metadata !{i32 786443, metadata !1, metadata !17, i32 7, i32 0, i32 3} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!17 = metadata !{i32 786443, metadata !1, metadata !12, i32 6, i32 0, i32 2} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!18 = metadata !{i32 8, i32 0, metadata !19, null} ; [ DW_TAG_imported_declaration ]
+!19 = metadata !{i32 786443, metadata !1, metadata !20, i32 8, i32 0, i32 5} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!20 = metadata !{i32 786443, metadata !1, metadata !16, i32 7, i32 0, i32 4} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!21 = metadata !{i32 9, i32 0, metadata !19, null}
+!22 = metadata !{i32 10, i32 0, metadata !23, null}
+!23 = metadata !{i32 786443, metadata !1, metadata !20, i32 10, i32 0, i32 6} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!24 = metadata !{i32 11, i32 0, metadata !25, null}
+!25 = metadata !{i32 786443, metadata !1, metadata !23, i32 10, i32 0, i32 7} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!26 = metadata !{i32 12, i32 0, metadata !25, null}
+!27 = metadata !{i32 13, i32 0, metadata !25, null}
+!28 = metadata !{i32 14, i32 0, metadata !29, null}
+!29 = metadata !{i32 786443, metadata !1, metadata !30, i32 14, i32 0, i32 9} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!30 = metadata !{i32 786443, metadata !1, metadata !23, i32 13, i32 0, i32 8} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!31 = metadata !{i32 15, i32 0, metadata !32, null}
+!32 = metadata !{i32 786443, metadata !1, metadata !29, i32 14, i32 0, i32 10} ; [ DW_TAG_lexical_block ] [propagate.cc]
+!33 = metadata !{i32 16, i32 0, metadata !32, null}
+!34 = metadata !{i32 17, i32 0, metadata !32, null}
+!35 = metadata !{i32 19, i32 0, metadata !20, null}
+!36 = metadata !{i32 21, i32 0, metadata !4, null}
+!37 = metadata !{i32 22, i32 0, metadata !4, null}
+!38 = metadata !{i32 25, i32 0, metadata !7, null}
+!39 = metadata !{i32 26, i32 0, metadata !7, null}
+!40 = metadata !{i32 27, i32 0, metadata !7, null}
+!41 = metadata !{i32 28, i32 0, metadata !7, null}
+!42 = metadata !{i32 29, i32 0, metadata !7, null}
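
Note on phase 1 (assignment of block weights) as exercised by propagate.ll:
each block starts with the sample count of its most frequently executed
instruction, and profile line numbers are relative to the function header
(line 3 for foo in the debug metadata of this test). The sketch below is an
assumption-laden illustration, not the pass's implementation.
BodySampleMap, makePropagateSamples and blockWeight are hypothetical names,
and treating lines with no sample as weight 0 is an assumption. The sample
values are copied from Inputs/propagate.prof.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the per-function sample table: maps a line
// offset (relative to the function header line) to its sample count.
using BodySampleMap = std::map<unsigned, uint32_t>;

// Body samples for _Z3fooiil, copied from Inputs/propagate.prof.
BodySampleMap makePropagateSamples() {
  return {{0, 0},      {1, 0},      {2, 0},     {4, 1},
          {5, 10},     {6, 0},      {7, 5},     {8, 3},
          {9, 0},      {10, 0},     {11, 6339}, {12, 16191},
          {13, 8141},  {16, 1},     {18, 0},    {19, 0}};
}

// Initial weight of a block: the largest sample count found among the
// source lines of its instructions. DebugLines holds absolute line numbers;
// HeaderLineno is the line where the function is defined.
uint32_t blockWeight(const std::vector<unsigned> &DebugLines,
                     unsigned HeaderLineno, const BodySampleMap &Samples) {
  uint32_t Weight = 0;
  for (unsigned Line : DebugLines) {
    if (Line < HeaderLineno)
      continue; // Assumption: ignore lines that precede the header.
    auto It = Samples.find(Line - HeaderLineno);
    if (It != Samples.end())
      Weight = std::max(Weight, It->second);
  }
  return Weight;
}
```

With the header at line 3, if.then6 (debug lines 11, 12 and 13) gets weight
3 and if.else7 (debug line 14) gets weight 6339, which appears to line up
with the numerators in the CHECK lines for the branch out of if.end.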

Modified: llvm/trunk/test/Transforms/SampleProfile/syntax.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/syntax.ll?rev=198972&r1=198971&r2=198972&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/syntax.ll (original)
+++ llvm/trunk/test/Transforms/SampleProfile/syntax.ll Fri Jan 10 17:23:46 2014
@@ -1,3 +1,4 @@
+; RUN: not opt < %s -sample-profile -sample-profile-file=%S/Inputs/syntax.prof 2>&1 | FileCheck -check-prefix=NO-DEBUG %s
 ; RUN: not opt < %s -sample-profile -sample-profile-file=missing.prof 2>&1 | FileCheck -check-prefix=MISSING-FILE %s
 ; RUN: not opt < %s -sample-profile -sample-profile-file=%S/Inputs/missing_symtab.prof 2>&1 | FileCheck -check-prefix=MISSING-SYMTAB %s
 ; RUN: not opt < %s -sample-profile -sample-profile-file=%S/Inputs/missing_num_syms.prof 2>&1 | FileCheck -check-prefix=MISSING-NUM-SYMS %s
@@ -9,7 +10,8 @@ define void @empty() {
 entry:
   ret void
 }
-; MISSING-FILE: LLVM ERROR: Could not open profile file missing.prof:
+; NO-DEBUG: LLVM ERROR: No debug information found in function empty
+; MISSING-FILE: LLVM ERROR: Could not open file missing.prof: No such file or directory
 ; MISSING-SYMTAB: LLVM ERROR: {{.*}}missing_symtab.prof:1: Expected 'symbol table', found 1
 ; MISSING-NUM-SYMS: LLVM ERROR: {{.*}}missing_num_syms.prof:2: Expected a number, found empty
 ; BAD-FN-HEADER: LLVM ERROR: {{.*}}bad_fn_header.prof:4: Expected 'mangled_name:NUM:NUM:NUM', found empty:100:BAD




