[llvm-commits] [llvm] r162799 - in /llvm/trunk: include/llvm/Analysis/Passes.h include/llvm/Analysis/ProfileDataLoader.h include/llvm/InitializePasses.h include/llvm/LinkAllPasses.h lib/Analysis/Analysis.cpp lib/Analysis/CMakeLists.txt lib/Analysis/ProfileDataLoader.cpp lib/Analysis/ProfileDataLoaderPass.cpp lib/Analysis/ProfileInfo.cpp test/Analysis/Profiling/load-branch-weights-ifs.ll test/Analysis/Profiling/load-branch-weights-loops.ll test/Analysis/Profiling/load-branch-weights-switches.ll

Thu Aug 30 13:24:49 PDT 2012

Hi Alastair,

The patch looks fine. Thank you for doing it! Let me know if you don't have commit access and I can apply it for you.

If the namespace issue is messing things up, then you can leave it. It was a minor issue. I'm still not 100% convinced about the weight code. I doubt that you're gaining much by expanding to uint64_t and then back to unsigned. If anything, you should probably just use uint64_t directly in the whole function instead of casting (which, gross):

static uint64_t AddCounts(uint64_t A, uint64_t B) {
  // ...
}

Would this address your concerns?

-bw

On Aug 29, 2012, at 6:06 PM, Alastair Murray <alastairmurray42 at gmail.com> wrote:

> Hi Bill,
> 
> Thank you for the comments.  Most of what you mentioned has been addressed in the attached patch.  std::vector and std::map are now not used at all (Evan Cheng also requested this, CC'ed).
> 
> Three points.  Firstly, regarding the use of 'namespace llvm', if I remove that and add 'llvm::' it results in the following warning:
> 
> /home/alym/llvm/llvm/lib/Analysis/ProfileDataLoader.cpp:26:47: warning: first declaration of static data member specialization of 'ID' outside namespace 'llvm' is a C++11 extension [-Wc++11-extensions]
> char llvm::ProfileDataT<Function,BasicBlock>::ID = 0;
> 
> Secondly, regarding the truncation.  The idea is to saturate profiled weights when summing the weights from multiple profile runs (to UINT32_MAX-1).  I have left this in, but the code could be simplified to just overflow.
> 
> Thirdly, this patch does not yet make use of MemoryBuffer.  That will follow in a separate patch.
> 
> Regards,
> Alastair.
> 
> 
> On 29/08/12 22:21, Bill Wendling wrote:
>> Hi Manman & Alastair,
>> 
>> Comments inline.
>> 
>> -bw
>> 
>> On Aug 28, 2012, at 3:21 PM, Manman Ren <mren at apple.com> wrote:
>> 
>>> Author: mren
>>> Date: Tue Aug 28 17:21:25 2012
>>> New Revision: 162799
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=162799&view=rev
>>> Log:
>>> Profile: set branch weight metadata with data generated from profiling.
>>> 
>>> This patch implements ProfileDataLoader which loads profile data generated by
>>> -insert-edge-profiling and updates branch weight metadata accordingly.
>>> 
>>> Patch by Alastair Murray.
>>> 
>>> Added: llvm/trunk/include/llvm/Analysis/ProfileDataLoader.h
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ProfileDataLoader.h?rev=162799&view=auto
>>> ==============================================================================
>>> --- llvm/trunk/include/llvm/Analysis/ProfileDataLoader.h (added)
>>> +++ llvm/trunk/include/llvm/Analysis/ProfileDataLoader.h Tue Aug 28 17:21:25 2012
>>> @@ -0,0 +1,148 @@
>>> +//===- ProfileDataLoader.h - Load & convert profile info ----*- C++ -*-===//
>>> +//
>>> +//                      The LLVM Compiler Infrastructure
>>> +//
>>> +// This file is distributed under the University of Illinois Open Source
>>> +// License. See LICENSE.TXT for details.
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +//
>>> +// The ProfileDataLoader class is used to load profiling data from a dump file.
>>> +// The ProfileDataT<FType, BType> class is used to store the mapping of this
>>> +// data to control flow edges.
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +
>>> +#ifndef LLVM_ANALYSIS_PROFILEDATALOADER_H
>>> +#define LLVM_ANALYSIS_PROFILEDATALOADER_H
>>> +
>>> +#include "llvm/Support/Debug.h"
>>> +#include "llvm/Support/ErrorHandling.h"
>>> +#include <vector>
>>> +#include <string>
>>> +#include <map>
>>> +
>>> +namespace llvm {
>>> +
>>> +class ModulePass;
>>> +class Function;
>>> +class BasicBlock;
>>> +
>>> +// Helpers for dumping edges to dbgs().
>>> +raw_ostream& operator<<(raw_ostream &O, std::pair<const BasicBlock *,
>>> +                                                  const BasicBlock *> E);
>>> +raw_ostream& operator<<(raw_ostream &O, const BasicBlock *BB);
>>> +raw_ostream& operator<<(raw_ostream &O, const Function *F);
>>> +
>>> +/// \brief The ProfileDataT<FType, BType> class is used to store the mapping of
>>> +/// profiling data to control flow edges.
>>> +///
>>> +/// An edge is defined by its source and sink basic blocks.
>>> +template<class FType, class BType>
>>> +class ProfileDataT {
>>> +  public:
>>> +  // The profiling information defines an Edge by its source and sink basic
>>> +  // blocks.
>>> +  typedef std::pair<const BType*, const BType*> Edge;
>>> +
>>> +  private:
>>> +  typedef std::map<Edge, unsigned> EdgeWeights;
>>> +
>> Is a std::map really necessary here? Could you get away with a DenseMap instead?
>> 
>>> +  /// \brief Count the number of times a transition between two blocks is
>>> +  /// executed.
>>> +  ///
>>> +  /// As a special case, we also hold an edge from the null BasicBlock to the
>>> +  /// entry block to indicate how many times the function was entered.
>>> +  std::map<const FType*, EdgeWeights> EdgeInformation;
>>> +
>>> +  public:
>>> +  static char ID; // Class identification, replacement for typeinfo
>>> +  ProfileDataT() {};
>>> +  ~ProfileDataT() {};
>>> +
>>> +  /// getFunction() - Returns the Function for an Edge.
>>> +  static const FType* getFunction(Edge e) {
>> 
>> We typically place the '*' next to the function name.
>> 
>>> +    // e.first may be NULL
>>> +    assert(   ((!e.first) || (e.first->getParent() == e.second->getParent()))
>> 
>> Why the spaces here?
>> 
>>> +           && "A ProfileData::Edge can not be between two functions");
>>> +    assert(e.second && "A ProfileData::Edge must have a real sink");
>>> +    return e.second->getParent();
>>> +  }
>>> +
>>> +  /// getEdge() - Creates an Edge between two BasicBlocks.
>>> +  static Edge getEdge(const BType *Src, const BType *Dest) {
>>> +    return std::make_pair(Src, Dest);
>> 
>> I suspect that you can get away with doing "Edge(Src, Dest)" here instead of std::make_pair. It's a bit more readable.
>> 
>>> +  }
>>> +
>>> +  /// getEdgeWeight - Return the number of times that a given edge was
>>> +  /// executed.
>>> +  unsigned getEdgeWeight(Edge e) const {
>>> +    const FType *f = getFunction(e);
>>> +    assert(   (EdgeInformation.find(f) != EdgeInformation.end())
>>> +           && "No profiling information for function");
>>> +    EdgeWeights weights = EdgeInformation.find(f)->second;
>>> +
>>> +    assert(   (weights.find(e) != weights.end())
>>> +           && "No profiling information for edge");
>>> +    return weights.find(e)->second;
>>> +  }
>>> +
>>> +  /// addEdgeWeight - Add 'weight' to the already stored execution count for
>>> +  /// this edge.
>>> +  void addEdgeWeight(Edge e, unsigned weight) {
>>> +      EdgeInformation[getFunction(e)][e] += weight;
>>> +  }
>>> +};
>>> +
>>> +typedef ProfileDataT<Function, BasicBlock> ProfileData;
>>> +//typedef ProfileDataT<MachineFunction, MachineBasicBlock> MachineProfileData;
>>> +
>>> +/// The ProfileDataLoader class is used to load raw profiling data from the
>>> +/// dump file.
>>> +class ProfileDataLoader {
>>> +private:
>>> +  /// The name of the file where the raw profiling data is stored.
>>> +  const std::string &Filename;
>>> +
>>> +  /// A vector of the command line arguments used when the target program was
>>> +  /// run to generate profiling data.  One entry per program run.
>>> +  std::vector<std::string> CommandLines;
>>> +
>> Would a SmallVector work here?
>> 
>>> +  /// The raw values for how many times each edge was traversed, values from
>>> +  /// multiple program runs are accumulated.
>>> +  std::vector<unsigned> EdgeCounts;
>>> +
>> And here?
>> 
>>> 
>>> Added: llvm/trunk/lib/Analysis/ProfileDataLoader.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ProfileDataLoader.cpp?rev=162799&view=auto
>>> ==============================================================================
>>> --- llvm/trunk/lib/Analysis/ProfileDataLoader.cpp (added)
>>> +++ llvm/trunk/lib/Analysis/ProfileDataLoader.cpp Tue Aug 28 17:21:25 2012
>>> @@ -0,0 +1,186 @@
>>> +//===- ProfileDataLoader.cpp - Load profile information from disk ---------===//
>>> +//
>>> +//                      The LLVM Compiler Infrastructure
>>> +//
>>> +// This file is distributed under the University of Illinois Open Source
>>> +// License. See LICENSE.TXT for details.
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +//
>>> +// The ProfileDataLoader class is used to load raw profiling data from the dump
>>> +// file.
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +
>>> +#include "llvm/Module.h"
>>> +#include "llvm/InstrTypes.h"
>>> +#include "llvm/Analysis/ProfileDataLoader.h"
>>> +#include "llvm/Analysis/ProfileDataTypes.h"
>>> +#include "llvm/Support/raw_ostream.h"
>>> +#include <cstdio>
>>> +#include <cstdlib>
>>> +using namespace llvm;
>>> +
>>> +namespace llvm {
>>> +
>> This namespace shouldn't be necessary. You can just add 'llvm::' to the function name.
>> 
>>> +template<>
>>> +char ProfileDataT<Function,BasicBlock>::ID = 0;
>>> +
>>> +raw_ostream& operator<<(raw_ostream &O, const Function *F) {
>>> +  return O << F->getName();
>>> +}
>>> +
>>> +raw_ostream& operator<<(raw_ostream &O, const BasicBlock *BB) {
>>> +  return O << BB->getName();
>>> +}
>>> +
>>> +raw_ostream& operator<<(raw_ostream &O, std::pair<const BasicBlock *, const BasicBlock *> E) {
>>> +  O << "(";
>>> +
>>> +  if (E.first)
>>> +    O << E.first;
>>> +  else
>>> +    O << "0";
>>> +
>>> +  O << ",";
>>> +
>>> +  if (E.second)
>>> +    O << E.second;
>>> +  else
>>> +    O << "0";
>>> +
>>> +  return O << ")";
>>> +}
>>> +
>>> +} // namespace llvm
>>> +
>>> +/// ByteSwap - Byteswap 'Var' if 'Really' is true.  Required when the compiler
>>> +/// host and target have different endianness.
>>> +static inline unsigned ByteSwap(unsigned Var, bool Really) {
>> 
>> Gross! What do you mean with the 'Really'? Either call this an expect it to do what it says or don't call it at all.
>> 
>>> +  if (!Really) return Var;
>>> +  return ((Var & (255U<< 0U)) << 24U) |
>>> +         ((Var & (255U<< 8U)) <<  8U) |
>>> +         ((Var & (255U<<16U)) >>  8U) |
>>> +         ((Var & (255U<<24U)) >> 24U);
>>> +}
>>> +
>>> +/// AddCounts - Add 'A' and 'B', accounting for the fact that the value of one
>>> +/// (or both) may not be defined.
>>> +static unsigned AddCounts(unsigned A, unsigned B) {
>>> +  // If either value is undefined, use the other.
>>> +  // Undefined + undefined = undefined.
>>> +  if (A == ProfileDataLoader::Uncounted) return B;
>>> +  if (B == ProfileDataLoader::Uncounted) return A;
>>> +
>>> +  // Saturate to the maximum storable value.  This could change taken/nottaken
>>> +  // ratios, but is presumably better than wrapping and thus potentially
>>> +  // inverting ratios.
>>> +  unsigned long long tmp = (unsigned long long)A + (unsigned long long)B;
>> 
>> Please use something else like uint64_t instead of unsigned long long.
>> 
>>> +  if (tmp > (unsigned long long)ProfileDataLoader::MaxCount)
>>> +    tmp = ProfileDataLoader::MaxCount;
>>> +  return (unsigned)tmp;
>> 
>> ?? Why the truncation here? Why are you doing math in 'unsigned long long' for 'unsigned' values?
>> 
>>> +}
>>> +
>>> +/// ReadProfilingData - Load 'NumEntries' items of type 'T' from file 'F'
>>> +template <typename T>
>>> +static void ReadProfilingData(const char *ToolName, FILE *F,
>>> +                              std::vector<T> &Data, size_t NumEntries) {
>>> +  // Read in the block of data...
>>> +  if (fread(&Data[0], sizeof(T), NumEntries, F) != NumEntries) {
>>> +    errs() << ToolName << ": profiling data truncated!\n";
>>> +    perror(0);
>>> +    exit(1);
>>> +  }
>>> +}
>>> +
>>> +/// ReadProfilingNumEntries - Read how many entries are in this profiling data
>>> +/// packet.
>>> +static unsigned ReadProfilingNumEntries(const char *ToolName, FILE *F,
>>> +                                        bool ShouldByteSwap) {
>>> +  std::vector<unsigned> NumEntries(1);
>> 
>> Use a SmallVector here please.
>> 
>>> +  ReadProfilingData<unsigned>(ToolName, F, NumEntries, 1);
>>> +  return ByteSwap(NumEntries[0], ShouldByteSwap);
>> 
>> Predicate calling ByteSwap on ShouldByteSwap instead of passing ShouldByteSwap in as a parameter.
>> 
>>> +}
>>> +
>>> +/// ReadProfilingBlock - Read the number of entries in the next profiling data
>>> +/// packet and then accumulate the entries into 'Data'.
>>> +static void ReadProfilingBlock(const char *ToolName, FILE *F,
>>> +                               bool ShouldByteSwap,
>>> +                               std::vector<unsigned> &Data) {
>>> +  // Read the number of entries...
>>> +  unsigned NumEntries = ReadProfilingNumEntries(ToolName, F, ShouldByteSwap);
>>> +
>>> +  // Read in the data.
>>> +  std::vector<unsigned> TempSpace(NumEntries);
>>> +  ReadProfilingData<unsigned>(ToolName, F, TempSpace, (size_t)NumEntries);
>>> +
>>> +  // Make sure we have enough space ...
>>> +  if (Data.size() < NumEntries)
>>> +    Data.resize(NumEntries, ProfileDataLoader::Uncounted);
>>> +
>>> +  // Accumulate the data we just read into the existing data.
>>> +  for (unsigned i = 0; i < NumEntries; ++i) {
>>> +    Data[i] = AddCounts(ByteSwap(TempSpace[i], ShouldByteSwap), Data[i]);
>>> +  }
>>> +}
>>> +
>>> +/// ReadProfilingArgBlock - Read the command line arguments that the progam was
>>> +/// run with when the current profiling data packet(s) were generated.
>>> +static void ReadProfilingArgBlock(const char *ToolName, FILE *F,
>> 
>> Please see about using MemoryBuffer instead of FILE*.
>> 
>>> +                                  bool ShouldByteSwap,
>>> +                                  std::vector<std::string> &CommandLines) {
>>> +  // Read the number of bytes ...
>>> +  unsigned ArgLength = ReadProfilingNumEntries(ToolName, F, ShouldByteSwap);
>>> +
>>> +  // Read in the arguments (if there are any to read).  Round up the length to
>>> +  // the nearest 4-byte multiple.
>>> +  std::vector<char> Args(ArgLength+4);
>> 
>> Try using SmallVector instead of std::vector when you can.
>> 
>>> +  if (ArgLength)
>>> +    ReadProfilingData<char>(ToolName, F, Args, (ArgLength+3) & ~3);
>>> +
>>> +  // Store the arguments.
>>> +  CommandLines.push_back(std::string(&Args[0], &Args[ArgLength]));
>>> +}
>>> +
>>> +const unsigned ProfileDataLoader::Uncounted = ~0U;
>>> +const unsigned ProfileDataLoader::MaxCount = ~0U - 1U;
>>> +
>>> +/// ProfileDataLoader ctor - Read the specified profiling data file, exiting
>>> +/// the program if the file is invalid or broken.
>>> +ProfileDataLoader::ProfileDataLoader(const char *ToolName,
>>> +                                     const std::string &Filename)
>>> +  : Filename(Filename) {
>>> +  FILE *F = fopen(Filename.c_str(), "rb");
>> 
>> Look into using MemoryBuffer for this instead.
>> 
>>> +  if (F == 0) {
>>> +    errs() << ToolName << ": Error opening '" << Filename << "': ";
>>> +    perror(0);
>>> +    exit(1);
>> 
>> Please don't call 'exit' from the library. If you need to do something drastic, use something like 'report_fatal_error'. (Same with the 'exit' call above.)
>> 
>>> Added: llvm/trunk/lib/Analysis/ProfileDataLoaderPass.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ProfileDataLoaderPass.cpp?rev=162799&view=auto
>>> ==============================================================================
>>> --- llvm/trunk/lib/Analysis/ProfileDataLoaderPass.cpp (added)
>>> +++ llvm/trunk/lib/Analysis/ProfileDataLoaderPass.cpp Tue Aug 28 17:21:25 2012
>> 
>> +/// readEdge - Take the value from a profile counter and assign it to an edge.
>> +void ProfileMetadataLoaderPass::readEdge(unsigned ReadCount,
>> +                                         ProfileData &PB, ProfileData::Edge e,
>> +                                         std::vector<unsigned> &Counters) {
>> 
>> Counters doesn't appear to be modified here. If so, please use `ArrayRef' instead.
>> 
>> +  if (ReadCount < Counters.size()) {
>> 
>> Early exit here instead of wrapping the whole thing in the 'if-then' statement.
>> 
>> +    unsigned weight = Counters[ReadCount];
>> +    assert(weight != ProfileDataLoader::Uncounted);
>> +    PB.addEdgeWeight(e, weight);
>> +
>> +    DEBUG(dbgs() << "-- Read Edge Counter for " << e
>> +                 << " (# "<< (ReadCount) << "): "
>> +                 << PB.getEdgeWeight(e) << "\n");
>> +  }
>> +}
>> 
>>> +/// setBranchWeightMetadata - Translate the counter values associated with each
>>> +/// edge into branch weights for each conditional branch (a branch with 2 or
>>> +/// more desinations).
>>> +void ProfileMetadataLoaderPass::setBranchWeightMetadata(Module &M,
>>> +                                                        ProfileData &PB) {
>>> +  for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F) {
>>> +    if (F->isDeclaration()) continue;
>>> +    DEBUG(dbgs() << "Setting branch metadata in '" << F->getName() << "'\n");
>>> +
>>> +    for (Function::iterator BB = F->begin(), E = F->end(); BB != E; ++BB) {
>>> +      TerminatorInst *TI = BB->getTerminator();
>>> +      unsigned NumSuccessors = TI->getNumSuccessors();
>>> +
>>> +      // If there is only one successor then we can not set a branch
>>> +      // probability as the target is certain.
>>> +      if (NumSuccessors < 2) continue;
>>> +
>>> +      // Load the weights of all edges leading from this terminator.
>>> +      DEBUG(dbgs() << "-- Terminator with " << NumSuccessors
>>> +                   << " successors:\n");
>>> +      std::vector<uint32_t> Weights(NumSuccessors);
>> 
>> Try using a SmallVector here.
>> 
>>> +      for (unsigned s = 0 ; s < NumSuccessors ; ++s) {
>>> +          ProfileData::Edge edge = PB.getEdge(BB, TI->getSuccessor(s));
>>> +          Weights[s] = (uint32_t)PB.getEdgeWeight(edge);
>>> +          DEBUG(dbgs() << "---- Edge '" << edge << "' has weight "
>>> +                       << Weights[s] << "\n");
>>> +      }
>>> +
>>> +      // Set branch weight metadata.  This will set branch probabilities of
>>> +      // 100%/0% if that is true of the dynamic execution.
>>> +      // BranchProbabilityInfo can account for this when it loads this metadata
>>> +      // (it gives the unexectuted branch a weight of 1 for the purposes of
>>> +      // probability calculations).
>>> +      MDBuilder MDB(TI->getContext());
>>> +      MDNode *Node = MDB.createBranchWeights(Weights);
>>> +      TI->setMetadata(LLVMContext::MD_prof, Node);
>>> +      NumTermsAnnotated++;
>>> +    }
>>> +  }
>>> +}
>>> +
>>> +bool ProfileMetadataLoaderPass::runOnModule(Module &M) {
>>> +  ProfileDataLoader PDL("profile-data-loader", Filename);
>>> +  ProfileData PB;
>>> +
>>> +  std::vector<unsigned> Counters = PDL.getRawEdgeCounts();
>>> +
>> 
>> Why are you making a copy of 'PDL.getRawEdgeCounts()' here? Is this array modified? (It didn't look like it at first...) If it's not modified, please change it into an ArrayRef<unsigned>.
>> 
>>> +  unsigned ReadCount = matchEdges(M, PB, Counters);
>>> +
>>> +  if (ReadCount != Counters.size()) {
>>> +    errs() << "WARNING: profile information is inconsistent with "
>>> +           << "the current program!\n";
>>> +  }
>>> +  NumEdgesRead = ReadCount;
>>> +
>>> +  setBranchWeightMetadata(M, PB);
>>> +
>>> +  return ReadCount > 0;
>>> +}
>> 
>> -bw
>> 
>> 
> 
> <weights_5_fix_profiler_loader.patch>