Automatic PGO - Initial implementation (1/N)

Joey Gouly Joey.Gouly at arm.com
Tue Sep 24 07:15:24 PDT 2013


Hi Diego,

Can you attach it as a patch, rather than in the text of the mail? (Using Phabricator: http://llvm-reviews.chandlerc.com/ would be even better)

Thanks,
Joey

-----Original Message-----
From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Diego Novillo
Sent: 24 September 2013 15:07
To: Chandler Carruth; Eric Christopher; llvm-commits at cs.uiuc.edu
Subject: Automatic PGO - Initial implementation (1/N)

This adds two options: -auto-profile and -auto-profile-file. With
-auto-profile, the compiler reads the profile file named by
-auto-profile-file and emits IR metadata reflecting that profile.

The profile file is assumed to have been generated by an external
profile source. The profile information is converted into IR metadata,
which is later used by the analysis routines to estimate block
frequencies, edge weights and other related data.

External profile information files have no fixed format; each profiler
is free to define its own. This includes both the on-disk representation
of the profile and the kind of profile information stored in the file.
A common kind of profile is based on sampling (e.g., perf), which
essentially counts how many times each line of the program has been
executed during the run.

The only requirement is that each profiler must provide a way to
load its own profile format into internal data structures and
then a method to convert that data into IR annotations.

The AutoProfile pass is organized as a scalar transformation. On
startup, it reads the file given in -auto-profile-file to determine what
kind of profile it contains.  This file is assumed to contain profile
information for the whole application. The profile data in the file is
read and incorporated into the internal state of the corresponding
profiler.

To facilitate testing, I've organized the profilers to support two file
formats: text and native. The native format is whatever on-disk
representation the profiler wants to support. I think this will mostly
be bitcode files, but it could be anything the profiler wants to
support. To do this, every profiler must implement the
AutoProfiler::loadNative() function.

The text format is mostly meant for debugging. Records are separated by
newlines, but each profiler is free to interpret records as it sees fit.
Profilers must implement the AutoProfiler::loadText() function.

Finally, the pass will call AutoProfiler::emitAnnotations() for each
function in the current translation unit. This function needs to
translate the loaded profile into IR metadata, which the analyzer will
later be able to use.

This patch implements the first steps towards the above design. I've
implemented a sample-based flat profiler. The format of the profile is
fairly simplistic. Each sampled function contains a list of relative
line locations (from the start of the function) together with a count
representing how many samples were collected at that line during
execution. I generate this profile using perf and a separate converter
tool.
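
As a rough illustration of that format (not part of the patch), the
records that follow the symbol table can be read by a standalone parser
along these lines. This is a minimal sketch that assumes the
'name:NUM:NUM:NUM' header and 'NUM: NUM' line records accepted by
loadText() in the patch below. It uses std::regex instead of
llvm::Regex, and the names FlatProfile, toUnsigned and
parseFlatProfiles are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical stand-in for the FunctionProfile struct in the patch.
struct FlatProfile {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<uint32_t, uint32_t> BodySamples; // line offset -> samples
};

// Convert a string of decimal digits to unsigned (assumes no overflow).
static unsigned toUnsigned(const std::string &S) {
  return static_cast<unsigned>(std::stoul(S));
}

// Parse the per-function records that follow the symbol table.
// Samples for a function that appears more than once are accumulated.
// Returns false on malformed or truncated input.
bool parseFlatProfiles(const std::string &Text,
                       std::map<std::string, FlatProfile> &Profiles) {
  static const std::regex HeadRE("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
  static const std::regex LineRE("^([0-9]+): ([0-9]+)$");
  std::istringstream In(Text);
  std::string Line;
  std::smatch M;
  while (std::getline(In, Line)) {
    if (!std::regex_match(Line, M, HeadRE))
      return false;
    FlatProfile &P = Profiles[M[1].str()];
    P.TotalSamples += toUnsigned(M[2].str());
    P.TotalHeadSamples += toUnsigned(M[3].str());
    unsigned NumLines = toUnsigned(M[4].str());
    for (unsigned I = 0; I < NumLines; ++I) {
      // Each sampled line is "<line offset>: <sample count>".
      if (!std::getline(In, Line) || !std::regex_match(Line, M, LineRE))
        return false;
      P.BodySamples[toUnsigned(M[1].str())] += toUnsigned(M[2].str());
    }
  }
  return true;
}
```

With the input "foo:10:2:2\n1: 4\n3: 6\nfoo:5:1:1\n1: 5\n", the two
records for foo are merged into a single flat profile.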

Currently, I have only implemented a text format for these profiles. I
am interested in initial feedback to the whole approach before I send
the other parts of the implementation for review.

This patch implements:

- The AutoProfile pass.
- The base AutoProfiler class with the core interface.
- A SampleBasedProfiler using the above interface. The profiler
  generates metadata autoprofile.samples on every IR instruction that
  matches the profiles.
- A text loader class to assist the implementation of
  AutoProfiler::loadText().
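
The matching done by the SampleBasedProfiler amounts to a lookup keyed
by line offset, where the offset is measured from the line of the
function's first instruction. The sketch below mirrors that step with
plain data structures instead of LLVM IR. Inst and annotate are
hypothetical names, and the Samples field stands in for the
autoprofile.samples metadata operand.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an IR instruction; only its source line matters.
struct Inst {
  unsigned Line;     // source line from debug info
  bool HasSamples;   // true once a sample count has been attached
  uint32_t Samples;  // value that would be stored in autoprofile.samples
  explicit Inst(unsigned L) : Line(L), HasSamples(false), Samples(0) {}
};

// Attach sample counts to every instruction whose line offset appears in
// BodySamples. Offsets are 1-based relative to the first instruction's line.
// Returns true if any instruction was annotated.
bool annotate(std::vector<Inst> &Body,
              const std::map<uint32_t, uint32_t> &BodySamples) {
  if (Body.empty())
    return false;
  unsigned FirstLine = Body.front().Line;
  bool Changed = false;
  for (Inst &I : Body) {
    uint32_t Offset = I.Line - FirstLine + 1;
    auto It = BodySamples.find(Offset);
    if (It == BodySamples.end())
      continue;
    I.HasSamples = true;
    I.Samples = It->second;
    Changed = true;
  }
  return Changed;
}
```

Several instructions that share a source line all receive the same
count, which is why one profile entry can annotate more than one
instruction.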

Caveats and questions:

1- I am almost certainly using the wrong APIs or using the right
   APIs in unorthodox ways. Please point me to better
   alternatives.

2- I was surprised to learn that line number information is not
   transferred into the IR unless we are emitting debug
   information. For sample-based profiling, I'm going to need
   line number information generated by the front-end
   independently of debug info. Eric, is that possible?

3- I have not included in this patch changes to the analyzer. I
   want to keep it focused on profile loading and IR
   annotation. In the analyzer, we will have propagation of
   attributes and other fixes (e.g., from the samples it is
   possible to have instructions in the same basic block
   registered with differing numbers of samples). I also have not
   included changes to code motion to get rid of the autoprofile
   information.

4- I need to add test cases.
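
To make the situation in caveat 3 concrete: when instructions in one
basic block carry different sample counts, the analyzer has to settle
on a single value for the block. The sketch below shows one possible
policy, taking the maximum. This is only an assumption for
illustration, not necessarily what the analyzer changes will do, and
blockWeight is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Pick a single weight for a basic block from the sample counts attached
// to its instructions. Assumed policy: use the largest count, since every
// instruction in a block executes the same number of times.
uint32_t blockWeight(const std::vector<uint32_t> &InstSamples) {
  if (InstSamples.empty())
    return 0;
  return *std::max_element(InstSamples.begin(), InstSamples.end());
}
```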

Mainly, I'm interested in making sure that this direction is
generally useful. I haven't given a lot of thought to other types
of profiling, but I'm certain any kind of tracing or other
execution frequency profiles can be adapted. Things like value
profiling may be a bit more involved, but mostly because I'm not
sure how the type and symbols are tracked in LLVM.

Thanks.  Diego.
---
 include/llvm/InitializePasses.h       |   1 +
 include/llvm/Transforms/Scalar.h      |   6 +
 lib/Transforms/Scalar/AutoProfile.cpp | 363 ++++++++++++++++++++++++++++++++++
 lib/Transforms/Scalar/CMakeLists.txt  |   1 +
 lib/Transforms/Scalar/Scalar.cpp      |   1 +
 5 files changed, 372 insertions(+)
 create mode 100644 lib/Transforms/Scalar/AutoProfile.cpp

diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
index 1b50bb2..c922228 100644
--- a/include/llvm/InitializePasses.h
+++ b/include/llvm/InitializePasses.h
@@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassRegistry&);
 void initializeAliasSetPrinterPass(PassRegistry&);
 void initializeAlwaysInlinerPass(PassRegistry&);
 void initializeArgPromotionPass(PassRegistry&);
+void initializeAutoProfilePass(PassRegistry&);
 void initializeBarrierNoopPass(PassRegistry&);
 void initializeBasicAliasAnalysisPass(PassRegistry&);
 void initializeBasicCallGraphPass(PassRegistry&);
diff --git a/include/llvm/Transforms/Scalar.h b/include/llvm/Transforms/Scalar.h
index 51aeba4..9ee5b49 100644
--- a/include/llvm/Transforms/Scalar.h
+++ b/include/llvm/Transforms/Scalar.h
@@ -354,6 +354,12 @@ FunctionPass *createLowerExpectIntrinsicPass();
 //
 FunctionPass *createPartiallyInlineLibCallsPass();

+//===----------------------------------------------------------------------===//
+//
+// AutoProfilePass - Loads profile data from disk and generates
+// IR metadata to reflect the profile.
+FunctionPass *createAutoProfilePass();
+
 } // End llvm namespace

 #endif
diff --git a/lib/Transforms/Scalar/AutoProfile.cpp b/lib/Transforms/Scalar/AutoProfile.cpp
new file mode 100644
index 0000000..7a63409
--- /dev/null
+++ b/lib/Transforms/Scalar/AutoProfile.cpp
@@ -0,0 +1,363 @@
+//===- AutoProfile.cpp - Incorporate an external profile into the IR ------===//
+//
+//                      The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements the Auto Profile transformation. This pass reads a
+// profile file generated by an external profiling source and generates IR
+// metadata to reflect the profile information in the given profile.
+//
+// TODO(dnovillo) Add more.
+//
+//===----------------------------------------------------------------------===//
+
+#define DEBUG_TYPE "auto-profile"
+
+#include <cstdlib>
+
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/OwningPtr.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/DebugInfo/DIContext.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Metadata.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Pass.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/InstIterator.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Scalar.h"
+
+using namespace llvm;
+
+// Command line option naming the profile file to load.
+static cl::opt<std::string> AutoProfileFilename(
+    "auto-profile-file", cl::init("autoprof.llvm"), cl::value_desc("filename"),
+    cl::desc("Profile file loaded by -auto-profile"), cl::Hidden);
+
+namespace {
+
+// Base profiler abstract class. This defines the abstract interface
+// that every profiler should respond to.
+//
+// TODO(dnovillo) - Eventually this class ought to move to a separate file.
+// There will be several types of profile loaders. Having them all together in
+// this file will get pretty messy.
+class AutoProfiler {
+public:
+  AutoProfiler(std::string filename) : filename_(filename) {}
+  virtual ~AutoProfiler() {}
+
+  // Load the profile from a file in the native format of this profile.
+  virtual bool loadNative() = 0;
+
+  // Load the profile from a text file.
+  virtual bool loadText() = 0;
+
+  // Dump this profile on stderr.
+  virtual void dump() = 0;
+
+  // Modify the IR with annotations corresponding to the loaded profile.
+  virtual bool emitAnnotations(Function &F) = 0;
+
+  // Instantiate an auto-profiler object based on the detected format of the
+  // given file name. Set *is_text to true if the file is in text format.
+  static AutoProfiler *
+  instantiateProfiler(const std::string filename, bool *is_text);
+
+protected:
+  // Path name to the file holding the profile data.
+  std::string filename_;
+};
+
+
+// Sample-based profiler. These profiles contain execution frequency
+// information on the function bodies of the program.
+class AutoProfileSampleBased : public AutoProfiler {
+public:
+  AutoProfileSampleBased(std::string filename)
+      : AutoProfiler(filename), profiles_(0) {}
+
+  // Metadata kind for autoprofile.samples.
+  static unsigned AutoProfileSamplesMDKind;
+
+  virtual void dump();
+  virtual bool loadText();
+  virtual bool loadNative() { llvm_unreachable("not implemented"); }
+  virtual bool emitAnnotations(Function &F);
+
+  void dumpFunctionProfile(StringRef fn_name);
+
+protected:
+  typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
+
+  struct FunctionProfile {
+    // Total number of samples collected inside this function. Samples
+    // are cumulative, they include all the samples collected inside
+    // this function and all its inlined callees.
+    unsigned TotalSamples;
+
+    // Total number of samples collected at the head of the function.
+    unsigned TotalHeadSamples;
+
+    // Map <line offset, samples> of line offset to samples collected
+    // inside the function. Each entry in this map contains the number
+    // of samples collected at the corresponding line offset. All line
+    // locations are an offset from the start of the function.
+    BodySampleMap BodySamples;
+  };
+
+  typedef StringMap<FunctionProfile> FunctionProfileMap;
+
+  FunctionProfileMap profiles_;
+};
+
+// Loader class for text-based profiles. These are mostly useful to
+// generate unit tests and not much else.
+class AutoProfileTextLoader {
+public:
+  AutoProfileTextLoader(std::string filename) : filename_(filename) {
+    error_code ec;
+    ec = MemoryBuffer::getFile(filename_, buffer_);
+    if (ec)
+      report_fatal_error("Could not open profile file " + filename_ + ": " +
+                         ec.message());
+    fp_ = buffer_->getBufferStart();
+    linenum_ = 0;
+  }
+
+  // Read a line from the mapped file. Update the current line and file pointer.
+  StringRef readLine() {
+    size_t length = 0;
+    const char *start = fp_;
+    while (fp_ != buffer_->getBufferEnd() && *fp_ != '\n') {
+      length++;
+      fp_++;
+    }
+    if (fp_ != buffer_->getBufferEnd())
+      fp_++;
+    linenum_++;
+    return StringRef(start, length);
+  }
+
+  // Return true, if we've reached EOF.
+  bool atEOF() const {
+    return fp_ == buffer_->getBufferEnd();
+  }
+
+  void reportParseError(std::string msg) const {
+    // TODO(dnovillo) - This is almost certainly the wrong way to emit
+    // diagnostics and exit the compiler.
+    errs() << filename_ << ":" << linenum_ << ": " << msg << "\n";
+    exit(1);
+  }
+
+private:
+  OwningPtr<MemoryBuffer> buffer_;
+  const char *fp_;
+  size_t linenum_;
+  std::string filename_;
+};
+
+// Auto profile pass. This pass reads profile data from the file specified
+// by -auto-profile-file and annotates every affected function with the
+// profile information found in that file.
+class AutoProfile : public FunctionPass {
+public:
+  // Class identification, replacement for typeinfo
+  static char ID;
+
+  AutoProfile() : FunctionPass(ID), profiler_(0) {
+    initializeAutoProfilePass(*PassRegistry::getPassRegistry());
+  }
+
+  virtual bool doInitialization(Module &M);
+
+  ~AutoProfile() { delete profiler_; }
+
+  void dump() { profiler_->dump(); }
+
+  virtual const char *getPassName() const {
+    return "Auto profile pass";
+  }
+
+  virtual bool runOnFunction(Function &F);
+
+  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
+    AU.setPreservesCFG();
+  }
+
+  bool loadProfile();
+
+protected:
+  AutoProfiler *profiler_;
+};
+}
+
+// Dump the sample profile for the given function.
+void AutoProfileSampleBased::dumpFunctionProfile(StringRef fn_name) {
+  FunctionProfile fn_profile = profiles_[fn_name];
+  errs() << "Function: " << fn_name << ", " << fn_profile.TotalSamples << ", "
+         << fn_profile.TotalHeadSamples << ", " << fn_profile.BodySamples.size()
+         << " sampled lines\n";
+  for (BodySampleMap::const_iterator si = fn_profile.BodySamples.begin();
+       si != fn_profile.BodySamples.end(); si++)
+    errs() << "\tline offset: " << si->first
+           << ", number of samples: " << si->second << "\n";
+  errs() << "\n";
+}
+
+// Dump all the collected function profiles.
+void AutoProfileSampleBased::dump() {
+  FunctionProfileMap::const_iterator it;
+  for (it = profiles_.begin(); it != profiles_.end(); it++)
+    dumpFunctionProfile(it->getKey());
+}
+
+// Load a sample profile from a text file.
+bool AutoProfileSampleBased::loadText() {
+  AutoProfileTextLoader loader(filename_);
+
+  // Read the symbol table.
+  std::string line = loader.readLine().str();
+  if (line != "symbol table")
+    loader.reportParseError("Expected 'symbol table', found " + line);
+  Regex num("^[0-9]+$");
+  line = loader.readLine().str();
+  if (!num.match(line))
+    loader.reportParseError("Expected a number, found " + line);
+  int num_symbols = atoi(line.c_str());
+  for (int i = 0; i < num_symbols; i++) {
+    StringRef fn_name = loader.readLine();
+    FunctionProfile &fn_profile = profiles_[fn_name];
+    fn_profile.BodySamples.clear();
+    fn_profile.TotalSamples = 0;
+    fn_profile.TotalHeadSamples = 0;
+  }
+
+  // Read the profile of each function. Since each function may be
+  // mentioned more than once, and we are collecting flat profiles,
+  // accumulate samples as we parse them.
+  while (!loader.atEOF()) {
+    SmallVector<StringRef, 4> matches;
+    Regex head_re("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
+    line = loader.readLine().str();
+    if (!head_re.match(line, &matches))
+      loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
+                              line);
+    assert(matches.size() == 5);
+    StringRef fn_name = matches[1];
+    unsigned num_samples = atoi(matches[2].str().c_str());
+    unsigned num_head_samples = atoi(matches[3].str().c_str());
+    unsigned num_sampled_lines = atoi(matches[4].str().c_str());
+    FunctionProfile &fn_profile = profiles_[fn_name];
+    fn_profile.TotalSamples += num_samples;
+    fn_profile.TotalHeadSamples += num_head_samples;
+    BodySampleMap &sample_map = fn_profile.BodySamples;
+    unsigned i;
+    for (i = 0; i < num_sampled_lines && !loader.atEOF(); i++) {
+      Regex line_sample("^([0-9]+): ([0-9]+)$");
+      line = loader.readLine().str();
+      if (!line_sample.match(line, &matches))
+        loader.reportParseError("Expected 'NUM: NUM', found " + line);
+      assert(matches.size() == 3);
+      unsigned line_offset = atoi(matches[1].str().c_str());
+      unsigned num_samples = atoi(matches[2].str().c_str());
+      sample_map[line_offset] += num_samples;
+    }
+
+    if (i < num_sampled_lines)
+      loader.reportParseError("Unexpected end of file");
+  }
+
+  return true;
+}
+
+// Annotate function F with the contents of the profile.
+bool AutoProfileSampleBased::emitAnnotations(Function &F) {
+  bool changed = false;
+  StringRef name = F.getName();
+  FunctionProfile &fn_profile = profiles_[name];
+  BodySampleMap &body_samples = fn_profile.BodySamples;
+  Instruction &first_inst = *(inst_begin(F));
+  unsigned first_line = first_inst.getDebugLoc().getLine();
+  LLVMContext &context = first_inst.getContext();
+  for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
+    Instruction &inst = *i;
+    const DebugLoc &dloc = inst.getDebugLoc();
+    unsigned loc_offset = dloc.getLine() - first_line + 1;
+    if (body_samples.find(loc_offset) != body_samples.end()) {
+      SmallVector<Value *, 1> sample_values;
+      sample_values.push_back(ConstantInt::get(Type::getInt32Ty(context),
+                                               body_samples[loc_offset]));
+      MDNode *md = MDNode::get(context, sample_values);
+      inst.setMetadata(AutoProfileSamplesMDKind, md);
+      changed = true;
+    }
+  }
+
+  DEBUG(if (changed) {
+              dbgs() << "\n\nInstructions changed in " << name << "\n";
+              for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
+                Instruction &inst = *i;
+                MDNode *md = inst.getMetadata(AutoProfileSamplesMDKind);
+                if (md) {
+                  assert(md->getNumOperands() == 1);
+                  ConstantInt *val = cast<ConstantInt>(md->getOperand(0));
+                  dbgs() << inst << " (" << val->getValue().getZExtValue()
+                         << " samples)\n";
+                }
+              }
+            });
+
+  return changed;
+}
+
+AutoProfiler *AutoProfiler::instantiateProfiler(const std::string filename,
+                                                bool *is_text) {
+  // TODO(dnovillo) - Implement file type detection and return the appropriate
+  // AutoProfiler sub-class instance.
+  *is_text = true;
+  return new AutoProfileSampleBased(filename);
+}
+
+unsigned AutoProfileSampleBased::AutoProfileSamplesMDKind = 0;
+char AutoProfile::ID = 0;
+INITIALIZE_PASS(AutoProfile, "auto-profile", "Auto Profile loader", false,
+                false)
+
+bool AutoProfile::runOnFunction(Function& F) {
+  return profiler_->emitAnnotations(F);
+}
+
+bool AutoProfile::loadProfile() {
+  bool is_text;
+  profiler_ =
+      AutoProfiler::instantiateProfiler(AutoProfileFilename, &is_text);
+  if (!profiler_)
+    return false;
+
+  return (is_text) ? profiler_->loadText() : profiler_->loadNative();
+}
+
+bool AutoProfile::doInitialization(Module &M) {
+  if (!loadProfile())
+    return false;
+  AutoProfileSampleBased::AutoProfileSamplesMDKind =
+      M.getContext().getMDKindID("autoprofile.samples");
+  return true;
+}
+
+FunctionPass *llvm::createAutoProfilePass() {
+  return new AutoProfile();
+}
diff --git a/lib/Transforms/Scalar/CMakeLists.txt b/lib/Transforms/Scalar/CMakeLists.txt
index 3b89fd4..1093ef0 100644
--- a/lib/Transforms/Scalar/CMakeLists.txt
+++ b/lib/Transforms/Scalar/CMakeLists.txt
@@ -1,5 +1,6 @@
 add_llvm_library(LLVMScalarOpts
   ADCE.cpp
+  AutoProfile.cpp
   CodeGenPrepare.cpp
   ConstantProp.cpp
   CorrelatedValuePropagation.cpp
diff --git a/lib/Transforms/Scalar/Scalar.cpp b/lib/Transforms/Scalar/Scalar.cpp
index 0c3ffbc..542aa33 100644
--- a/lib/Transforms/Scalar/Scalar.cpp
+++ b/lib/Transforms/Scalar/Scalar.cpp
@@ -28,6 +28,7 @@ using namespace llvm;
 /// ScalarOpts library.
 void llvm::initializeScalarOpts(PassRegistry &Registry) {
   initializeADCEPass(Registry);
+  initializeAutoProfilePass(Registry);
   initializeCodeGenPreparePass(Registry);
   initializeConstantPropagationPass(Registry);
   initializeCorrelatedValuePropagationPass(Registry);
--
1.8.4


