[LLVMdev] Reviewer for our Path Profiling Implementation

Andrew Trick atrick at apple.com
Tue Dec 7 16:09:48 PST 2010


On Dec 3, 2010, at 11:21 AM, Adam Preuss wrote:
> I am a student at the University of Alberta under the  
> supervision of José Nelson Amaral, and I have been working on  
> implementing path profiling into LLVM. I have completed my project
> and would like to submit it.
> 
> We are looking for a reviewer for the path profiling implementation. We
> have sent previous requests to the llvmdev list but have so far been
> unsuccessful.
> 
> Please see the attached patch and associated documentation.
> 
> Adam Preuss
> 
> <path-profiling.patch>
> <doc.pdf>

Hi Adam,

Thanks for making the effort to contribute. I have seen some valid
feedback to the effect that we can't keep something in mainline unless
is it actively used or maintained. However, I'm willing to give you a
review so it can be checked in, with no guarantee that the code won't
be removed if it breaks, or replaced by a different profiler if we
ever adopt one for active use.

Thanks for the design doc. We'll be sure to get that checked into
llvm.org/pubs. A couple of design questions: Can you compare the
overhead of LLVM's edge profiler vs path profiler, both run time
(profile time) and compile time? How do you anticipate the path
profile to be used by the optimizer?

Regarding the implementation, a couple of style problems should be addressed:

- Tabs absolutely need to be removed:
http://llvm.org/docs/CodingStandards.html#scf_spacestabs

- I see a few violation of 80 cols. Not many, but too many for me
  to list here.

If you think any of my specific suggestions below are off-base, don't
follow them, just respond with some justification. Since you've
already done rigorous verification, and I'm not overly concerned with
optimality, most of my comments are style related.

---
+static cl::opt<std::string>
+EdgeProfileFilename("path-profile-verifier-file",
+	cl::init("edgefrompath.llvmprof.out"),
+	cl::value_desc("filename"),
+  cl::desc("Edge profile file generated by -path-profile-verifier"));
...
+// command line option for loading path profiles
+static cl::opt<std::string>
+PathProfileInfoFilename("path-profile-loader-file", cl::init("llvmprof.out"),
+                        cl::value_desc("filename"),
+                        cl::desc("Path profile file loaded by -path-profile-loader"));
...
+// Are we enabling early termination
+static cl::opt<bool> ProcessEarlyTermination("process-early-termination",
+	cl::desc("In path profiling, insert extra instrumentation to account for "
+           "unexpected function termination."));
...
+// Should we print the dot-graphs
+static cl::opt<bool> DotPathDag("dot-pathdag",
+	cl::desc("Output the path profiling DAG for each function."));

Please add cl::Hidden to all new flags, and give them all
a "path-profile" prefix so they aren't scattered in the hidden help text.

+++ include/llvm/Analysis/ProfileInfoTypes.h

+#define PP_ARRAY 0
+#define PP_HASH  1

Why not an enum?

+typedef struct {
+  unsigned fnNumber; /* function number for these counters */
+  unsigned numEntries;   /* number of entries stored */
+} PathHeader;
+
+typedef struct {
+  unsigned pathNumber;
+  unsigned pathCounter;
+} PathTableEntry;

Being global types, they need a PathProfiling-specific name or prefix
(there are other kinds of paths). To be proper, should they also be under:
#if defined(__cplusplus)
 extern "C" {
endif

You may even want to remove the emacs hint "-*- C++ -*-" from both
this header and Profiling.h

+++ runtime/libprofile/Profiling.h

+#include <stdint.h>
+#include <unistd.h>

Include these only in the .cpp files in which they're needed. Include
them after any non-system headers.

+typedef int PType; /* type for storing enum ProfilingType on disk */

I realize PType is currently only included within libprofile, but the
general guideline is to make type names in the global namespace
descriptive. Such as ProfileInfoStorageType.

I'm not sure why you made it a global type. Was it for use by the
profile loader? If so, I think it would need to be in
ProfileInfoTypes.h under extern "C".

+++ runtime/libprofile/CommonProfiling.c

+static int OutFile = -1;

I'm not sure why OutFile cannot remain a local static within
getOutFile.

+  int res;
...
+  res = write(outFile, &PTy, sizeof( Profile));

Some builds may warn of the unused result. If you're not checking the
result, you probably shouldn't retrieve it.

+++ runtime/libprofile/PathProfiling.c

+ inline uint32_t* getPathCounter(uint32_t functionNumber, uint32_t pathNumber) {
...
+	if( ((ftEntry_t*)ft)[functionNumber-1].array == 0)
+		((ftEntry_t*)ft)[functionNumber-1].array =
calloc(sizeof(pathHashTable_t), 1);

It's not immediately obvious to me why you always cast "ft".

Since this is a runtime lib, it may be wise to add array bounds checks.

In getPathCounter can you check that type == PP_HASH and size == 0
before writing to the data structure?

+++ lib/Transforms/Instrumentation/ProfilingUtils.h	(working copy)

+#include "llvm/DerivedTypes.h"

Add the include only in the .cpp files that need it.

+++ include/llvm/Analysis/PathNumbering.h

+namespace llvm {
+	class BallLarusNode;

I don't think you should indent large namespaces bodies. Same goes for
the rest of the files.

+++ lib/Analysis/PathNumbering.cpp

The constructor and accessors are trivial and can be defined inline.

+ void BallLarusDag::calculatePathNumbersFrom(BallLarusNode* node) {
...
+      BallLarusEdge* currEdge = *succ;
+      currEdge->setWeight(sumPaths);
+      succNode = currEdge->getTarget();
+      unsigned succPaths = succNode->getNumberPaths();
+      isReady = isReady && (succPaths != 0);

If a successor is not finished, can you early return here instead of
prematurely setting the edge weights? Better yet, keep a count
of the remaining successors, then only call calculatePathNumberFrom()
once per node after all successors are visited? You already visiting
each pred edge after finishing a node, just decrement its remaining
count at that time.

+++ lib/Transforms/Instrumentation/PathProfiling.cpp

+ void PathProfiler::insertInstrumentationStartingAt(BLInstrumentationEdge*

Does this need to be recursive? If you simply visit the edges in dfs
order (creation order?) would that be sufficient? I don't think you
even need to create RPO order, since phis can be lazily prepared.

+++ include/llvm/Analysis/PathProfileInfo.h

+ #ifndef LLVM_ANALYSIS_BALLLARUSPATHPROFILEINFO_H

Symbol doesn't really need to be this long.

+#include <stack>
+#include "llvm/BasicBlock.h"
+#include "llvm/Analysis/PathNumbering.h"

Put system headers after user headers.

+namespace llvm {
+	class Path;

A class Path already exists. You just got lucky it's in llvm::sys. Please
wrap your classes in a namespace (e.g. llvm::PathProfile) or give
them fully descriptive names.

+++ include/llvm/Analysis/PathProfileInfo.cpp

+unsigned PathEdge::getDuplicateNumber() {
+	return _duplicateNumber;
+}
+
+BasicBlock* PathEdge::getSource() {
+	return _source;
+}
+
+BasicBlock* PathEdge::getTarget() {
+	return _target;
+}
...

Trivial accessors should be defined inline.

-Andy



More information about the llvm-dev mailing list