Automatic PGO - Initial implementation (1/N)
Eric Christopher
echristo at gmail.com
Wed Sep 25 09:18:28 PDT 2013
On Wed, Sep 25, 2013 at 9:11 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> Hi Diego,
>
> I am curious. Why is this called automatic PGO? Why is it automatic?
Because it doesn't require instrumentation. :)
Better names welcome. Maybe "sample pgo"? *shrug* Auto just sounds cooler.
-eric
>
> Also, why is the auto-profile part done as a scalar transformation pass? Did you consider alternatives?
>
> Thanks,
>
> Evan
>
>> On Sep 24, 2013, at 7:07 AM, Diego Novillo <dnovillo at google.com> wrote:
>>
>> This adds two options: -auto-profile and -auto-profile-file. The
>> -auto-profile option causes the compiler to read the profile file named
>> by -auto-profile-file and emit IR metadata reflecting that profile.
>>
>> The profile file is assumed to have been generated by an external
>> profile source. The profile information is converted into IR metadata,
>> which is later used by the analysis routines to estimate block
>> frequencies, edge weights and other related data.
>>
>> External profile information files have no fixed format; each profiler
>> is free to define its own. This includes both the on-disk representation
>> of the profile and the kind of profile information stored in the file.
>> A common kind of profile is based on sampling (e.g., perf), which
>> essentially estimates how many times each line of the program was
>> executed during the run.
>>
>> The only requirement is that each profiler must provide a way to
>> load its own profile format into internal data structures and
>> then a method to convert that data into IR annotations.
>>
>> The AutoProfile pass is organized as a scalar transformation. On
>> startup, it reads the file given in -auto-profile-file to determine what
>> kind of profile it contains. This file is assumed to contain profile
>> information for the whole application. The profile data in the file is
>> read and incorporated into the internal state of the corresponding
>> profiler.
>>
>> To facilitate testing, I've organized the profilers to support two file
>> formats: text and native. The native format is whatever on-disk
>> representation the profiler wants to support. I think this will mostly
>> be bitcode files, but it could be anything the profiler wants to
>> support. To do this, every profiler must implement the
>> AutoProfiler::loadNative() function.
>>
>> The text format is mostly meant for debugging. Records are separated by
>> newlines, but each profiler is free to interpret records as it sees fit.
>> Profilers must implement the AutoProfiler::loadText() function.
>>
>> Finally, the pass will call AutoProfiler::emitAnnotations() for each
>> function in the current translation unit. This function needs to
>> translate the loaded profile into IR metadata, which the analyzer will
>> later be able to use.
>>
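For readers skimming the thread, the annotation step described above can be
modeled without any LLVM types. This is only a sketch under assumptions: Inst
and annotate are hypothetical names, and the HasSamples/Samples fields stand in
for the autoprofile.samples metadata node. The offset computation mirrors the
`line - first_line + 1` expression in the patch's emitAnnotations().

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an IR instruction: a source line plus a slot
// that plays the role of the autoprofile.samples metadata.
struct Inst {
  unsigned Line;
  bool HasSamples;
  uint32_t Samples;
};

// Attach a sample count to every instruction whose line offset appears in
// the profile. Offsets are 1-based and relative to the line of the first
// instruction in the function body. Returns true if anything was annotated.
bool annotate(std::vector<Inst> &Body,
              const std::map<uint32_t, uint32_t> &BodySamples) {
  if (Body.empty())
    return false;
  const unsigned FirstLine = Body.front().Line;
  bool Changed = false;
  for (Inst &I : Body) {
    // Skip instructions located before the first line (for example, a
    // missing location reported as line 0) instead of wrapping around.
    if (I.Line < FirstLine)
      continue;
    const uint32_t Offset = I.Line - FirstLine + 1;
    std::map<uint32_t, uint32_t>::const_iterator It = BodySamples.find(Offset);
    if (It == BodySamples.end())
      continue;
    I.HasSamples = true;
    I.Samples = It->second;
    Changed = true;
  }
  return Changed;
}
```

Instructions whose offset has no entry in the profile are left untouched,
which is why a later propagation step in the analyzer is still needed.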
>> This patch implements the first steps towards the above design. I've
>> implemented a sample-based flat profiler. The format of the profile is
>> fairly simplistic. Each sampled function contains a list of relative
>> line locations (from the start of the function) together with a count
>> representing how many samples were collected at that line during
>> execution. I generate this profile using perf and a separate converter
>> tool.
>>
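To make the flat profile layout concrete, here is a minimal standalone sketch
of a reader for the text form, as far as it can be inferred from loadText() in
the patch: a "symbol table" line, a symbol count, that many function names,
then records of the form name:total:head:N followed by N "offset: samples"
lines. It uses only the standard library; ProfileMap and parseTextProfile are
hypothetical names, and error handling via exceptions is an assumption made
for the sketch, not what the patch does.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Simplified stand-in for the patch's FunctionProfile.
struct FunctionProfile {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<uint32_t, uint32_t> BodySamples; // line offset -> samples
};

typedef std::map<std::string, FunctionProfile> ProfileMap;

// Parse a text profile. Throws a std::exception subclass on malformed input.
ProfileMap parseTextProfile(std::istream &In) {
  ProfileMap Profiles;
  std::string Line;

  // Symbol table: a marker line, a count, then one function name per line.
  if (!std::getline(In, Line) || Line != "symbol table")
    throw std::runtime_error("expected 'symbol table'");
  if (!std::getline(In, Line))
    throw std::runtime_error("expected a symbol count");
  const unsigned long NumSymbols = std::stoul(Line);
  for (unsigned long I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      throw std::runtime_error("truncated symbol table");
    Profiles[Line]; // create an empty entry for this function
  }

  // Function records. A function may appear more than once, so samples
  // are accumulated rather than overwritten.
  while (std::getline(In, Line)) {
    std::istringstream Header(Line);
    std::string Name, Total, Head, Count;
    if (!std::getline(Header, Name, ':') || !std::getline(Header, Total, ':') ||
        !std::getline(Header, Head, ':') || !std::getline(Header, Count))
      throw std::runtime_error("expected 'name:NUM:NUM:NUM', found " + Line);
    FunctionProfile &FP = Profiles[Name];
    FP.TotalSamples += static_cast<unsigned>(std::stoul(Total));
    FP.TotalHeadSamples += static_cast<unsigned>(std::stoul(Head));
    const unsigned long NumLines = std::stoul(Count);
    for (unsigned long I = 0; I < NumLines; ++I) {
      if (!std::getline(In, Line))
        throw std::runtime_error("unexpected end of file");
      const std::string::size_type Colon = Line.find(": ");
      if (Colon == std::string::npos)
        throw std::runtime_error("expected 'NUM: NUM', found " + Line);
      const uint32_t Offset =
          static_cast<uint32_t>(std::stoul(Line.substr(0, Colon)));
      FP.BodySamples[Offset] +=
          static_cast<uint32_t>(std::stoul(Line.substr(Colon + 2)));
    }
  }
  return Profiles;
}
```

Feeding it two records for the same function sums both the totals and the
per-offset counts, which matches the "flat profile" accumulation described in
the loader's comments.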
>> Currently, I have only implemented a text format for these profiles. I
>> am interested in initial feedback to the whole approach before I send
>> the other parts of the implementation for review.
>>
>> This patch implements:
>>
>> - The AutoProfile pass.
>> - The base AutoProfiler class with the core interface.
>> - A SampleBasedProfiler using the above interface. The profiler
>> generates metadata autoprofile.samples on every IR instruction that
>> matches the profiles.
>> - A text loader class to assist the implementation of
>> AutoProfiler::loadText().
>>
>> Caveats and questions:
>>
>> 1- I am almost certainly using the wrong APIs or using the right
>> APIs in unorthodox ways. Please point me to better
>> alternatives.
>>
>> 2- I was surprised to learn that line number information is not
>> transferred into the IR unless we are emitting debug
>> information. For sample-based profiling, I'm going to need
>> line number information generated by the front-end
>> independently of debug info. Eric, is that possible?
>>
>> 3- I have not included in this patch changes to the analyzer. I
>> want to keep it focused on the profile loading and IR
>> annotation. In the analyzer, we will have propagation of
>> attributes and other fixes (e.g., from the samples it is
>> possible to have instructions in the same basic block
>> registered with different numbers of samples). I also have not
>> included changes to code motion to get rid of the autoprofile
>> information.
>>
>> 4- I need to add test cases.
>>
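On caveat 3, one way the analyzer could reconcile instructions in the same
basic block that carry different sample counts is to take the largest count as
the weight of the whole block, since every instruction in a block executes the
same number of times. This is an assumed policy for illustration only; the
patch does not specify one, and blockWeight is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Assumed policy: use the maximum per-instruction sample count as the
// weight of the block. An empty block, or one with no samples, gets 0.
uint32_t blockWeight(const std::vector<uint32_t> &InstSamples) {
  uint32_t Weight = 0;
  for (uint32_t S : InstSamples)
    Weight = std::max(Weight, S);
  return Weight;
}
```

Other choices (average, sum, first sampled instruction) are equally easy to
express; the maximum simply avoids letting an under-sampled instruction pull
the block's weight down.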
>> Mainly, I'm interested in making sure that this direction is
>> generally useful. I haven't given a lot of thought to other types
>> of profiling, but I'm certain any kind of tracing or other
>> execution frequency profiles can be adapted. Things like value
>> profiling may be a bit more involved, but mostly because I'm not
>> sure how the type and symbols are tracked in LLVM.
>>
>> Thanks. Diego.
>> ---
>> include/llvm/InitializePasses.h | 1 +
>> include/llvm/Transforms/Scalar.h | 6 +
>> lib/Transforms/Scalar/AutoProfile.cpp | 363 ++++++++++++++++++++++++++++++++++
>> lib/Transforms/Scalar/CMakeLists.txt | 1 +
>> lib/Transforms/Scalar/Scalar.cpp | 1 +
>> 5 files changed, 372 insertions(+)
>> create mode 100644 lib/Transforms/Scalar/AutoProfile.cpp
>>
>> diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
>> index 1b50bb2..c922228 100644
>> --- a/include/llvm/InitializePasses.h
>> +++ b/include/llvm/InitializePasses.h
>> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassRegistry&);
>> void initializeAliasSetPrinterPass(PassRegistry&);
>> void initializeAlwaysInlinerPass(PassRegistry&);
>> void initializeArgPromotionPass(PassRegistry&);
>> +void initializeAutoProfilePass(PassRegistry&);
>> void initializeBarrierNoopPass(PassRegistry&);
>> void initializeBasicAliasAnalysisPass(PassRegistry&);
>> void initializeBasicCallGraphPass(PassRegistry&);
>> diff --git a/include/llvm/Transforms/Scalar.h b/include/llvm/Transforms/Scalar.h
>> index 51aeba4..9ee5b49 100644
>> --- a/include/llvm/Transforms/Scalar.h
>> +++ b/include/llvm/Transforms/Scalar.h
>> @@ -354,6 +354,12 @@ FunctionPass *createLowerExpectIntrinsicPass();
>> //
>> FunctionPass *createPartiallyInlineLibCallsPass();
>>
>> +//===----------------------------------------------------------------------===//
>> +//
>> +// AutoProfilePass - Loads profile data from disk and generates
>> +// IR metadata to reflect the profile.
>> +FunctionPass *createAutoProfilePass();
>> +
>> } // End llvm namespace
>>
>> #endif
>> diff --git a/lib/Transforms/Scalar/AutoProfile.cpp b/lib/Transforms/Scalar/AutoProfile.cpp
>> new file mode 100644
>> index 0000000..7a63409
>> --- /dev/null
>> +++ b/lib/Transforms/Scalar/AutoProfile.cpp
>> @@ -0,0 +1,363 @@
>> +//===- AutoProfile.cpp - Incorporate an external profile into the IR ------===//
>> +//
>> +// The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +// This file implements the Auto Profile transformation. This pass reads a
>> +// profile file generated by an external profiling source and generates IR
>> +// metadata to reflect the profile information in the given profile.
>> +//
>> +// TODO(dnovillo) Add more.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#define DEBUG_TYPE "auto-profile"
>> +
>> +#include <cstdlib>
>> +
>> +#include "llvm/ADT/DenseMap.h"
>> +#include "llvm/ADT/OwningPtr.h"
>> +#include "llvm/ADT/StringMap.h"
>> +#include "llvm/DebugInfo/DIContext.h"
>> +#include "llvm/IR/Constants.h"
>> +#include "llvm/IR/Function.h"
>> +#include "llvm/IR/Instructions.h"
>> +#include "llvm/IR/LLVMContext.h"
>> +#include "llvm/IR/Metadata.h"
>> +#include "llvm/IR/Module.h"
>> +#include "llvm/Pass.h"
>> +#include "llvm/Support/CommandLine.h"
>> +#include "llvm/Support/Debug.h"
>> +#include "llvm/Support/InstIterator.h"
>> +#include "llvm/Support/MemoryBuffer.h"
>> +#include "llvm/Support/Regex.h"
>> +#include "llvm/Support/raw_ostream.h"
>> +#include "llvm/Transforms/Scalar.h"
>> +
>> +using namespace llvm;
>> +
>> +// Command line option naming the profile file to load.
>> +static cl::opt<std::string> AutoProfileFilename(
>> + "auto-profile-file", cl::init("autoprof.llvm"), cl::value_desc("filename"),
>> + cl::desc("Profile file loaded by -auto-profile"), cl::Hidden);
>> +
>> +namespace {
>> +
>> +// Base profiler abstract class. This defines the abstract interface
>> +// that every profiler should respond to.
>> +//
>> +// TODO(dnovillo) - Eventually this class ought to move to a separate file.
>> +// There will be several types of profile loaders. Having them all together in
>> +// this file will get pretty messy.
>> +class AutoProfiler {
>> +public:
>> + AutoProfiler(std::string filename) : filename_(filename) {}
>> + virtual ~AutoProfiler() {}
>> +
>> + // Load the profile from a file in the native format of this profile.
>> + virtual bool loadNative() = 0;
>> +
>> + // Load the profile from a text file.
>> + virtual bool loadText() = 0;
>> +
>> + // Dump this profile on stderr.
>> + virtual void dump() = 0;
>> +
>> + // Modify the IR with annotations corresponding to the loaded profile.
>> + virtual bool emitAnnotations(Function &F) = 0;
>> +
>> + // Instantiate an auto-profiler object based on the detected format of the
>> + // given file name. Set *is_text to true if the file is in text format.
>> + static AutoProfiler *
>> + instantiateProfiler(const std::string filename, bool *is_text);
>> +
>> +protected:
>> + // Path name to the file holding the profile data.
>> + std::string filename_;
>> +};
>> +
>> +
>> +// Sample-based profiler. These profiles contain execution frequency
>> +// information on the function bodies of the program.
>> +class AutoProfileSampleBased : public AutoProfiler {
>> +public:
>> + AutoProfileSampleBased(std::string filename)
>> + : AutoProfiler(filename), profiles_(0) {}
>> +
>> + // Metadata kind for autoprofile.samples.
>> + static unsigned AutoProfileSamplesMDKind;
>> +
>> + virtual void dump();
>> + virtual bool loadText();
>> + virtual bool loadNative() { llvm_unreachable("not implemented"); }
>> + virtual bool emitAnnotations(Function &F);
>> +
>> + void dumpFunctionProfile(StringRef fn_name);
>> +
>> +protected:
>> + typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
>> +
>> + struct FunctionProfile {
>> + // Total number of samples collected inside this function. Samples
>> + // are cumulative: they include all the samples collected inside
>> + // this function and all its inlined callees.
>> + unsigned TotalSamples;
>> +
>> + // Total number of samples collected at the head of the function.
>> + unsigned TotalHeadSamples;
>> +
>> + // Map <line offset, samples> of line offset to samples collected
>> + // inside the function. Each entry in this map contains the number
>> + // of samples collected at the corresponding line offset. All line
>> + // locations are an offset from the start of the function.
>> + BodySampleMap BodySamples;
>> + };
>> +
>> + typedef StringMap<FunctionProfile> FunctionProfileMap;
>> +
>> + FunctionProfileMap profiles_;
>> +};
>> +
>> +// Loader class for text-based profiles. These are mostly useful to
>> +// generate unit tests and not much else.
>> +class AutoProfileTextLoader {
>> +public:
>> + AutoProfileTextLoader(std::string filename) : filename_(filename) {
>> + error_code ec;
>> + ec = MemoryBuffer::getFile(filename_, buffer_);
>> + if (ec)
>> + report_fatal_error("Could not open profile file " + filename_ + ": " +
>> + ec.message());
>> + fp_ = buffer_->getBufferStart();
>> + linenum_ = 0;
>> + }
>> +
>> + // Read a line from the mapped file. Update the current line and file pointer.
>> + StringRef readLine() {
>> + size_t length = 0;
>> + const char *start = fp_;
>> + while (fp_ != buffer_->getBufferEnd() && *fp_ != '\n') {
>> + length++;
>> + fp_++;
>> + }
>> + if (fp_ != buffer_->getBufferEnd())
>> + fp_++;
>> + linenum_++;
>> + return StringRef(start, length);
>> + }
>> +
>> + // Return true, if we've reached EOF.
>> + bool atEOF() const {
>> + return fp_ == buffer_->getBufferEnd();
>> + }
>> +
>> + void reportParseError(std::string msg) const {
>> + // TODO(dnovillo) - This is almost certainly the wrong way to emit
>> + // diagnostics and exit the compiler.
>> + errs() << filename_ << ":" << linenum_ << ": " << msg << "\n";
>> + exit(1);
>> + }
>> +
>> +private:
>> + OwningPtr<MemoryBuffer> buffer_;
>> + const char *fp_;
>> + size_t linenum_;
>> + std::string filename_;
>> +};
>> +
>> +// Auto profile pass. This pass reads profile data from the file specified
>> +// by -auto-profile-file and annotates every affected function with the
>> +// profile information found in that file.
>> +class AutoProfile : public FunctionPass {
>> +public:
>> + // Class identification, replacement for typeinfo
>> + static char ID;
>> +
>> + AutoProfile() : FunctionPass(ID), profiler_(0) {
>> + initializeAutoProfilePass(*PassRegistry::getPassRegistry());
>> + }
>> +
>> + virtual bool doInitialization(Module &M);
>> +
>> + ~AutoProfile() { delete profiler_; }
>> +
>> + void dump() { profiler_->dump(); }
>> +
>> + virtual const char *getPassName() const {
>> + return "Auto profile pass";
>> + }
>> +
>> + virtual bool runOnFunction(Function &F);
>> +
>> + virtual void getAnalysisUsage(AnalysisUsage &AU) const {
>> + AU.setPreservesCFG();
>> + }
>> +
>> + bool loadProfile();
>> +
>> +protected:
>> + AutoProfiler *profiler_;
>> +};
>> +}
>> +
>> +// Dump the sample profile for the given function.
>> +void AutoProfileSampleBased::dumpFunctionProfile(StringRef fn_name) {
>> + FunctionProfile fn_profile = profiles_[fn_name];
>> + errs() << "Function: " << fn_name << ", " << fn_profile.TotalSamples << ", "
>> + << fn_profile.TotalHeadSamples << ", " << fn_profile.BodySamples.size()
>> + << " sampled lines\n";
>> + for (BodySampleMap::const_iterator si = fn_profile.BodySamples.begin();
>> + si != fn_profile.BodySamples.end(); si++)
>> + errs() << "\tline offset: " << si->first
>> + << ", number of samples: " << si->second << "\n";
>> + errs() << "\n";
>> +}
>> +
>> +// Dump all the collected function profiles.
>> +void AutoProfileSampleBased::dump() {
>> + FunctionProfileMap::const_iterator it;
>> + for (it = profiles_.begin(); it != profiles_.end(); it++)
>> + dumpFunctionProfile(it->getKey());
>> +}
>> +
>> +// Load a sample profile from a text file.
>> +bool AutoProfileSampleBased::loadText() {
>> + AutoProfileTextLoader loader(filename_);
>> +
>> + // Read the symbol table.
>> + std::string line = loader.readLine().str();
>> + if (line != "symbol table")
>> + loader.reportParseError("Expected 'symbol table', found " + line);
>> + Regex num("[0-9]+");
>> + line = loader.readLine().str();
>> + if (!num.match(line))
>> + loader.reportParseError("Expected a number, found " + line);
>> + int num_symbols = atoi(line.c_str());
>> + for (int i = 0; i < num_symbols; i++) {
>> + StringRef fn_name = loader.readLine();
>> + FunctionProfile &fn_profile = profiles_[fn_name];
>> + fn_profile.BodySamples.clear();
>> + fn_profile.TotalSamples = 0;
>> + fn_profile.TotalHeadSamples = 0;
>> + }
>> +
>> + // Read the profile of each function. Since each function may be
>> + // mentioned more than once, and we are collecting flat profiles,
>> + // accumulate samples as we parse them.
>> + while (!loader.atEOF()) {
>> + SmallVector<StringRef, 4> matches;
>> + Regex head_re("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
>> + line = loader.readLine().str();
>> + if (!head_re.match(line, &matches))
>> + loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
>> + line);
>> + assert(matches.size() == 5);
>> + StringRef fn_name = matches[1];
>> + unsigned num_samples = atoi(matches[2].str().c_str());
>> + unsigned num_head_samples = atoi(matches[3].str().c_str());
>> + unsigned num_sampled_lines = atoi(matches[4].str().c_str());
>> + FunctionProfile &fn_profile = profiles_[fn_name];
>> + fn_profile.TotalSamples += num_samples;
>> + fn_profile.TotalHeadSamples += num_head_samples;
>> + BodySampleMap &sample_map = fn_profile.BodySamples;
>> + unsigned i;
>> + for (i = 0; i < num_sampled_lines && !loader.atEOF(); i++) {
>> + Regex line_sample("^([0-9]+): ([0-9]+)$");
>> + line = loader.readLine().str();
>> + if (!line_sample.match(line, &matches))
>> + loader.reportParseError("Expected 'NUM: NUM', found " + line);
>> + assert(matches.size() == 3);
>> + unsigned line_offset = atoi(matches[1].str().c_str());
>> + unsigned num_samples = atoi(matches[2].str().c_str());
>> + sample_map[line_offset] += num_samples;
>> + }
>> +
>> + if (i < num_sampled_lines)
>> + loader.reportParseError("Unexpected end of file");
>> + }
>> +
>> + return true;
>> +}
>> +
>> +// Annotate function F with the contents of the profile.
>> +bool AutoProfileSampleBased::emitAnnotations(Function &F) {
>> + bool changed = false;
>> + StringRef name = F.getName();
>> + FunctionProfile &fn_profile = profiles_[name];
>> + BodySampleMap &body_samples = fn_profile.BodySamples;
>> + Instruction &first_inst = *(inst_begin(F));
>> + unsigned first_line = first_inst.getDebugLoc().getLine();
>> + LLVMContext &context = first_inst.getContext();
>> + for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
>> + Instruction &inst = *i;
>> + const DebugLoc &dloc = inst.getDebugLoc();
>> + unsigned loc_offset = dloc.getLine() - first_line + 1;
>> + if (body_samples.find(loc_offset) != body_samples.end()) {
>> + SmallVector<Value *, 1> sample_values;
>> + sample_values.push_back(ConstantInt::get(Type::getInt32Ty(context),
>> + body_samples[loc_offset]));
>> + MDNode *md = MDNode::get(context, sample_values);
>> + inst.setMetadata(AutoProfileSamplesMDKind, md);
>> + changed = true;
>> + }
>> + }
>> +
>> + DEBUG(if (changed) {
>> + dbgs() << "\n\nInstructions changed in " << name << "\n";
>> + for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
>> + Instruction &inst = *i;
>> + MDNode *md = inst.getMetadata(AutoProfileSamplesMDKind);
>> + if (md) {
>> + assert(md->getNumOperands() == 1);
>> + ConstantInt *val = cast<ConstantInt>(md->getOperand(0));
>> + dbgs() << inst << " (" << val->getValue().getZExtValue()
>> + << " samples)\n";
>> + }
>> + }
>> + });
>> +
>> + return changed;
>> +}
>> +
>> +AutoProfiler *AutoProfiler::instantiateProfiler(const std::string filename,
>> + bool *is_text) {
>> + // TODO(dnovillo) - Implement file type detection and return the appropriate
>> + // AutoProfiler sub-class instance.
>> + *is_text = true;
>> + return new AutoProfileSampleBased(filename);
>> +}
>> +
>> +unsigned AutoProfileSampleBased::AutoProfileSamplesMDKind = 0;
>> +char AutoProfile::ID = 0;
>> +INITIALIZE_PASS(AutoProfile, "auto-profile", "Auto Profile loader", false,
>> + false)
>> +
>> +bool AutoProfile::runOnFunction(Function& F) {
>> + return profiler_->emitAnnotations(F);
>> +}
>> +
>> +bool AutoProfile::loadProfile() {
>> + bool is_text;
>> + profiler_ =
>> + AutoProfiler::instantiateProfiler(AutoProfileFilename, &is_text);
>> + if (!profiler_)
>> + return false;
>> +
>> + return (is_text) ? profiler_->loadText() : profiler_->loadNative();
>> +}
>> +
>> +bool AutoProfile::doInitialization(Module &M) {
>> + if (!loadProfile())
>> + return false;
>> + AutoProfileSampleBased::AutoProfileSamplesMDKind =
>> + M.getContext().getMDKindID("autoprofile.samples");
>> + return true;
>> +}
>> +
>> +FunctionPass *llvm::createAutoProfilePass() {
>> + return new AutoProfile();
>> +}
>> diff --git a/lib/Transforms/Scalar/CMakeLists.txt b/lib/Transforms/Scalar/CMakeLists.txt
>> index 3b89fd4..1093ef0 100644
>> --- a/lib/Transforms/Scalar/CMakeLists.txt
>> +++ b/lib/Transforms/Scalar/CMakeLists.txt
>> @@ -1,5 +1,6 @@
>> add_llvm_library(LLVMScalarOpts
>> ADCE.cpp
>> + AutoProfile.cpp
>> CodeGenPrepare.cpp
>> ConstantProp.cpp
>> CorrelatedValuePropagation.cpp
>> diff --git a/lib/Transforms/Scalar/Scalar.cpp b/lib/Transforms/Scalar/Scalar.cpp
>> index 0c3ffbc..542aa33 100644
>> --- a/lib/Transforms/Scalar/Scalar.cpp
>> +++ b/lib/Transforms/Scalar/Scalar.cpp
>> @@ -28,6 +28,7 @@ using namespace llvm;
>> /// ScalarOpts library.
>> void llvm::initializeScalarOpts(PassRegistry &Registry) {
>> initializeADCEPass(Registry);
>> + initializeAutoProfilePass(Registry);
>> initializeCodeGenPreparePass(Registry);
>> initializeConstantPropagationPass(Registry);
>> initializeCorrelatedValuePropagationPass(Registry);
>> --
>> 1.8.4
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits