[PATCH] D77752: [llvm] Machine Learned policy for inlining -Oz

David Blaikie via Phabricator via llvm-commits <llvm-commits at lists.llvm.org>
Wed Apr 8 16:19:33 PDT 2020


dblaikie added a comment.

Just practical things (which aren't the point right now, given the general design review), but they might be handy further down the road.



================
Comment at: llvm/lib/Analysis/InlineCost.cpp:2283
+  if (!R.isSuccess())
+    return {};
+  return CA.getCost();
----------------
Generally prefer llvm::None rather than {} for the not-present Optional state.
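
For example, a sketch of the fix on the snippet above (assuming the return type is an Optional):

  if (!R.isSuccess())
    return llvm::None;
  return CA.getCost();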


================
Comment at: llvm/lib/Analysis/InliningAdvisor.cpp:32-34
+class InliningAdvisorImpl {};
+struct PendingInliningRecordImpl {};
+
----------------
Doesn't look like these two definitions are required.

But more generally - this region of code seems to me like a lot of surface area to #ifdef in and out for the ML parameters.

Does the non-trivial implementation of this pull in problematic dependencies (the way LLVM's code ensures it works with or without zlib), or is the #ifdef just a little extra insurance that this wouldn't adversely affect people when it's first committed? If it's the latter, I'd probably just make this a runtime setting (so the functionality can be tested on every buildbot/developer machine, etc.), not a compile-time/#ifdef'd-out situation.


If it is a zlib-like situation, I'd still be in favor of a smaller surface area for the opt in/out - like having the basic interfaces in one library & another library that provides the ML implementation, plus a factory function that gives you either the trivial implementation or the ML one, etc.


Ah, looking further on, I see this adds a TF dependency (aside: does this make LLVM & TF circularly dependent?) - yeah, then I'd expect the CMake variable would just be "has TF/doesn't have TF", and then maybe an LLVM library that requires TF and doesn't build/link without it (hmm, maybe that's more complicated than what you've got... I don't know for sure) - and then the usage would just be `x = #if TF new TFThing() #else new NullThing() #endif`, etc.
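
Roughly something like this, as a sketch (all names here - LLVM_HAVE_TF, TFThing, NullThing, makeAdvisor - are placeholders, not the patch's actual spellings):

  #include <memory>

  struct InliningAdvisorBase { virtual ~InliningAdvisorBase() = default; };
  struct NullThing : InliningAdvisorBase {};               // trivial, always available
  #ifdef LLVM_HAVE_TF                                      // hypothetical CMake-defined macro
  struct TFThing : InliningAdvisorBase { /* ML impl */ };  // only built when TF is present
  #endif

  std::unique_ptr<InliningAdvisorBase> makeAdvisor() {
  #ifdef LLVM_HAVE_TF
    return std::make_unique<TFThing>();
  #else
    return std::make_unique<NullThing>();
  #endif
  }

That keeps the #ifdef in exactly one spot instead of spread through the advisor code.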


================
Comment at: llvm/lib/Analysis/ML/IRToNativeSizeLearning.cpp:31-32
+using namespace llvm;
+namespace {
+size_t getSize(Function &F, TargetTransformInfo &TTI) {
+  size_t Ret = 0;
----------------
The LLVM style guide encourages file-scope static for file-local functions, and only using anonymous namespaces for types: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
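
i.e., roughly:

  using namespace llvm;

  // File-local helper: static at file scope, no anonymous namespace needed.
  static size_t getSize(Function &F, TargetTransformInfo &TTI) {
    size_t Ret = 0;
    // ... body unchanged ...
    return Ret;
  }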


================
Comment at: llvm/lib/Analysis/ML/InliningAdvisor.cpp:480
+FuncDesc InliningAdvisorImpl::getFuncDesc(const Function &F) const {
+  return FuncDescs.getOrInsert(&F, [this](const Function *F) {
+    FuncDesc Ret;
----------------
Generally, if you're passing a lambda that'll only be run immediately (or even later within the scope it's declared in, but not escaping it) - I'd suggest using [&] for capture, treating the lambda's scope the same as other scopes (for/while/if/etc.) that implicitly have access to all outer variables.
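
i.e., on the call above:

  return FuncDescs.getOrInsert(&F, [&](const Function *F) {
    FuncDesc Ret;
    // ... rest of the lambda unchanged ...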


================
Comment at: llvm/lib/Analysis/ML/InliningAdvisor.cpp:492-494
+        if (BI->isConditional()) {
+          Ret.ConditionallyExecutedBlocks += BI->getNumSuccessors();
+        }
----------------
Generally LLVM code doesn't use {} on single-line blocks.
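
i.e.:

  if (BI->isConditional())
    Ret.ConditionallyExecutedBlocks += BI->getNumSuccessors();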


================
Comment at: llvm/lib/Analysis/ML/InliningModelFeatureMaps.h:16-32
+enum FeatureList {
+  CalleeBasicBlockCount = 0,
+  CallSiteHeight,
+  NodeCount,
+  NrCtantParams,
+  CostEstimate,
+  EdgeCount,
----------------
Enumerators need a prefix: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly (or use an enum class, which forces using the enum name as the prefix).
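
e.g. either of these (the prefix/name choices here are just placeholders):

  enum FeatureList {
    FL_CalleeBasicBlockCount = 0,
    FL_CallSiteHeight,
    // ...
  };

  // or:

  enum class FeatureIndex {
    CalleeBasicBlockCount = 0,
    CallSiteHeight,
    // ...
  };
  // used as FeatureIndex::CalleeBasicBlockCount, so the enum name itself is the prefix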


================
Comment at: llvm/lib/Analysis/ML/InliningModelFeatureMaps.h:34-54
+static const std::array<std::string, FeatureList::NumberOfFeatures>
+    FeatureNameMap{
+        /*FeatureList::CalleeBasicBlockCount*/ "callee_basic_block_count",
+        /*FeatureList::CallSiteHeight*/ "callsite_height",
+        /*FeatureList::NodeCount*/ "node_count",
+        /*FeatureList::NrCtantParams*/ "nr_ctant_params",
+        /*FeatureList::CostEstimate*/ "cost_estimate",
----------------
static variables in headers are problematic (they define new/distinct variables in every translation unit that includes the header) - you can use inline accessor functions, or maybe constexpr variables... (I forget/don't know the specific linkage rules there), or declare the array in the header and define it in a cpp file.
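
Sketch of the declare-in-header/define-in-cpp option (remaining entries elided):

  // InliningModelFeatureMaps.h
  extern const std::array<std::string, FeatureList::NumberOfFeatures> FeatureNameMap;

  // InliningModelFeatureMaps.cpp (or wherever the definition ends up)
  const std::array<std::string, FeatureList::NumberOfFeatures> FeatureNameMap{
      "callee_basic_block_count",
      "callsite_height",
      // ...
  };

With that, every translation unit that includes the header refers to the one definition rather than getting its own copy of the array.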


================
Comment at: llvm/lib/Analysis/ML/InliningModelRunnerProduction.h:56-63
+class FunctionSizeEstimator {
+public:
+  FunctionSizeEstimator(LLVMContext &Ctx, FunctionAnalysisManager &FAM) {}
+  // Just return the number of blocks. This is interesting for debugging only.
+  size_t getSizeEstimate(const Function &F) { return F.size(); }
+  bool isValid() const { return true; }
+};
----------------
Ah, this is more of the "two implementations of the same entities" issue discussed earlier - yeah, I'd be in favor of avoiding this, as it seems like it'll make the code quite a bit harder to work with (since it'll only compile in one mode at a time, changes to these APIs will need different builds to validate that both implementations are correctly updated, etc.), plus the concerns around testing, etc.


================
Comment at: llvm/lib/Analysis/ML/InliningModelRunnerProduction.h:116
+
+InliningModelRunner::~InliningModelRunner() {}
+
----------------
Prefer `= default` where possible
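
i.e.:

  InliningModelRunner::~InliningModelRunner() = default;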


================
Comment at: llvm/lib/Analysis/ML/InliningModelRunnerTraining.h:24-45
+
+static cl::opt<std::string>
+    TrainingLog("training-log", cl::Hidden,
+                cl::desc("Path where the inlining log is saved."));
+
+static cl::opt<std::string> TFTrainedModelPath(
+    "ml-inliner-trained-model", cl::Hidden,
----------------
More static things in headers which will need to move to an implementation file
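
i.e., roughly - move them to the implementation file:

  // InliningModelRunnerTraining.cpp (or a similar .cpp), not the header:
  static cl::opt<std::string>
      TrainingLog("training-log", cl::Hidden,
                  cl::desc("Path where the inlining log is saved."));

(and if other files need the value, expose it through a function declared in the header.)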


================
Comment at: llvm/lib/Analysis/ML/InliningModelRunnerTraining.h:172-173
+  if (isUsingInference())
+    return *(
+        static_cast<int64_t *>(TF_TensorData(Evaluator->getInput()[Index])));
+  return FeatureStorage[Index];
----------------
Looks like an extra set of () here that could be removed
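
i.e.:

  if (isUsingInference())
    return *static_cast<int64_t *>(TF_TensorData(Evaluator->getInput()[Index]));
  return FeatureStorage[Index];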


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77752/new/

https://reviews.llvm.org/D77752