[PATCH] D146483: Add infrastructure to use EmitC-generated models for inlining.

Jacob Hegna via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 27 10:42:07 PDT 2023


jacobhegna added inline comments.


================
Comment at: llvm/include/llvm/Analysis/EmitCModelRegistry.h:55
+
+  std::unordered_map<std::string, std::unique_ptr<ModelT>> Models;
+};
----------------
mtrofin wrote:
> `StringMap`?
Fixed.


================
Comment at: llvm/include/llvm/Analysis/EmitCModelRegistry.h:60
+// an object of this type is all you need to do to register the model.
+template <class ModelT> class EmitCModelRegistrationHandle {
+public:
----------------
mtrofin wrote:
> why do we need this, wouldn't the macro below be sufficient?
We need to call `EmitCModelRegistry<ModelT>::get().addModel(...)` before `main()` runs in the final linked binary. The easiest way to do that is to define a global object whose constructor runs that code; that's the point of `EmitCModelRegistrationHandle`. We can't (as far as I know) just have the macro, because then we wouldn't have a "context" in which to run the needed code (right now, that context is the constructor of `...Handle`).

This is basically how `cl::opt` works, too.

Note that this mechanism is about to change when I update the patch. Instead of registering instantiated `unique_ptr`s of each model, I register a factory method capable of creating an `MLModelRunner` pointer. The reason for the change is that, now that the EmitC models don't share a common base class, there isn't a good way to store them all in a common registry (but we can store the `MLModelRunner*`, because the runners use a subclass templated on the EmitC model).


================
Comment at: llvm/include/llvm/Analysis/EmitCTensor.h:1
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
----------------
mtrofin wrote:
> won't this be part of the model package, i.e. something that's (maybe cloned) per-model?
It is true that the EmitC runtime includes all of this code. It's not quite cloned per model right now, because in the Python EmitC code I rename the `Tensor` type to something like `_UnusedTensorType`.

The reason is that when EmitC generates models, the signatures of the functions take `Tensor`s for input and output. If the definition of `Tensor` lives in a local, anonymous namespace (i.e. the embedded runtime), the code won't even compile, because there won't be an accessible declaration of `Tensor` for the model header.

There are really only two solutions to this:
1) Make `Tensor` a type declared in a common LLVM header that each model can `#include`, so each model uses the same shared definition of `Tensor`. This is what I did in this commit.

2) Make the model signatures use only `void*` or `float*`, and keep the definition of `Tensor` as an internal implementation detail.

I talked with Mircea offline, and (2) seems OK for us. It avoids exposing this code outside of generated models. It weakens the types in the exported function signatures, but that's not horrible, because we already deal with `void*` internally (and we're the only user of this modified EmitC backend).


================
Comment at: llvm/include/llvm/Analysis/EmitCTensor.h:1
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
----------------
marbre wrote:
> jacobhegna wrote:
> > mtrofin wrote:
> > > won't this be part of the model package, i.e. something that's (maybe cloned) per-model?
> > It is true that the EmitC runtime includes all of this code. It's not quite cloned per model right now, because in the Python EmitC code I rename the `Tensor` type to something like `_UnusedTensorType`.
> > 
> > The reason is that when EmitC generates models, the signatures of the functions take `Tensor`s for input and output. If the definition of `Tensor` lives in a local, anonymous namespace (i.e. the embedded runtime), the code won't even compile, because there won't be an accessible declaration of `Tensor` for the model header.
> > 
> > There are really only two solutions to this:
> > 1) Make `Tensor` a type declared in a common LLVM header that each model can `#include`, so each model uses the same shared definition of `Tensor`. This is what I did in this commit.
> > 
> > 2) Make the model signatures use only `void*` or `float*`, and keep the definition of `Tensor` as an internal implementation detail.
> > 
> > I talked with Mircea offline, and (2) seems OK for us. It avoids exposing this code outside of generated models. It weakens the types in the exported function signatures, but that's not horrible, because we already deal with `void*` internally (and we're the only user of this modified EmitC backend).
> I actually had the same question in mind and I am not sure if `llvm/include/llvm/Analysis` is a good location for this. However, we discussed some time ago if we should upstream the whole reference implementation, see https://discourse.llvm.org/t/tosa-reference-model-from-mlir-using-emitc/4799/11.
> 
> Furthermore, I don't know if it is acceptable to upstream code not licensed under `Apache-2.0 WITH LLVM exceptions`. The good news is, we would be able to re-license (most of) the reference implementation.
(See my comment to Mircea above as well.)

+1 to making the license change; that will be a blocker regardless of how we want to use the library. Right now I don't use any of the Eigen code, so we can ignore that (if that was the main thing that can't be relicensed).

So, originally we were planning to put the entire reference implementation in LLVM and have models `#include` it. The reason we didn't is essentially that we wanted two conflicting things:
1) Models maintain bit-identical stability over time: give a model these bits as input today, and it will give you the same answer today or six months from now.
2) The ability to upgrade the runtime for performance or other reasons.

(2) can conflict with (1) because upgrades can introduce slight differences due to floating-point numerical-stability issues. So instead we embed the runtime in each model individually: new models get all the upgrades, and old models keep their stability.


================
Comment at: llvm/include/llvm/Analysis/MLInlineEmitCModel.h:1
+//===- MLInlineEmitCModel.h -- Model for inlining EmitC Models --*- C++ -*-===//
+//
----------------
mtrofin wrote:
> Nit: isn't it "Model for inlining Oz"?
The file is gone, so this isn't an issue anymore.


================
Comment at: llvm/include/llvm/Analysis/MLInlineEmitCModel.h:35
+  // not strictly necessary.
+  virtual void set_inlining_default(emitc::Tensor<int64_t, 1> x) {}
+  virtual void set_step_type(emitc::Tensor<int32_t, 1> x) {}
----------------
mtrofin wrote:
> can't these not be surfaced to begin with?
The base model is going away, so this isn't a problem anymore.


================
Comment at: llvm/lib/Analysis/MLInlinerEmitCRunner.h:26
+template <class T, int64_t... Dims>
+emitc::Tensor<T, Dims...> convertBufferToEmitCTensor(void *Buffer,
+                                                     TensorSpec Spec) {
----------------
mtrofin wrote:
> This should be in the emit-ed code, so `emitc` stuff shouldn't appear in any llvm-facing APIs.
Fixed. This function actually doesn't exist anymore.


================
Comment at: llvm/lib/Analysis/MLInlinerEmitCRunner.h:63
+    size_t idx = static_cast<size_t>(FeatureIndex::cpp_name);                  \
+    Model->set_##py_name(convertBufferToEmitCTensor<int64_t, 1>(               \
+        getTensorUntyped(idx), InputSpecs[idx]));                              \
----------------
mtrofin wrote:
> can't we set the buffers in the ctor once? they don't get reallocated.
Fixed.


================
Comment at: llvm/lib/Analysis/models/emitc/InlineOzTestModel.emitc.h:20
+namespace llvm::emitc::generated {
+class InlineOzTestModel : public ::llvm::MLInlineOzEmitCModel {
+private:
----------------
mtrofin wrote:
> Not sure what the inheritance relationship buys us. We already have a common mechanism for referring to features by their enum-named index. This also seems to add a tight coupling between codegen and llvm.
> 
> If all we need is a way to identify groups of models in the registry, that could be done with the address of a static specific to each consumer, i.e. MLInlineAdvisor could have a static char ID = '\0' and the address of that ID is the key (like the legacy PM does things, too) 
Fixed, but not with the static char trick; see the patch update for details.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146483/new/

https://reviews.llvm.org/D146483


