[PATCH] D142642: [mlgo] Introduce an "InteractiveModelRunner"

Mircea Trofin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 26 08:42:31 PST 2023


mtrofin created this revision.
mtrofin added a reviewer: kazu.
Herald added a subscriber: hiraditya.
Herald added a project: All.
mtrofin requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This is a model runner for ML researchers using environments like
CompilerGym. In such environments, researchers host the compiler and
want to be able to observe the problem space (features) at each decision
step of some optimization pass. At each such step the compiler stops and
waits for the host to make a decision and provide advice back to the
compiler, which then continues its normal operation, and so on.

The InteractiveModelRunner supports this scenario for the feature set
exposed by the compiler at a given time. It uses two files - ideally FIFO
pipes - one to pass data to the host, the other to get advice back from
the host. This means this scenario is supported with no special
dependencies. File creation and deletion are the responsibility of
the host. Hooking up this model evaluator to an MLGO-ed pass is the
responsibility of the pass author. Subsequent patches will do so for
the current set of mlgo passes, and offer an API to easily "just opt in"
by default when mlgo-ing a new pass.
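
As a rough host-side illustration of the "creation and deletion is the
responsibility of the host" point, here is a minimal Python sketch. The
helper name host_channels and the pipe file names are hypothetical, and
how the two paths are handed to the compiler is not described in this
message, so that part is left out.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def host_channels(prefix="mlgo-interactive-"):
    """Create two FIFOs owned by the host and remove them on exit.

    The names below are hypothetical; the patch description only says
    that one file carries data to the host and the other carries advice
    back to the compiler.
    """
    workdir = tempfile.mkdtemp(prefix=prefix)
    to_host = os.path.join(workdir, "compiler-to-host")
    from_host = os.path.join(workdir, "host-to-compiler")
    try:
        os.mkfifo(to_host)
        os.mkfifo(from_host)
        yield to_host, from_host
    finally:
        # Deletion is the host's job, so clean up even on error.
        for path in (to_host, from_host):
            if os.path.exists(path):
                os.remove(path)
        os.rmdir(workdir)
```

Note that opening a FIFO blocks until the other end is opened too, so a
real host would need to start the compiler before (or concurrently with)
opening the pipes.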

The data protocol is that of the training logger: the host sees a training
log doled out observation by observation by reading from one of the
files, and passes back its advice as a serialized tensor (i.e. tensor value
memory dump) via the other file.
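
To make "serialized tensor (i.e. tensor value memory dump)" concrete,
here is a small Python sketch of how a host might produce such bytes.
The element type (64-bit signed integer) and the helper names are
assumptions for illustration; the actual type and shape of the advice
are determined by the pass being driven.

```python
import array
import io


def dump_tensor(values, typecode="q"):
    """Return the raw, native-byte-order memory of a flat tensor.

    typecode "q" (signed 64-bit) is an assumption for this sketch.
    """
    return array.array(typecode, values).tobytes()


def send_advice(channel, values, typecode="q"):
    """Write the tensor dump to a binary file-like object and flush,
    so that a reader blocked on the other end sees it immediately."""
    channel.write(dump_tensor(values, typecode))
    channel.flush()


# Stand-in for the advice pipe; a real host would open the FIFO in "wb" mode.
demo_channel = io.BytesIO()
send_advice(demo_channel, [1])
```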

There are some differences wrt the log seen during training: the
interactive model doesn't currently include the outcome (because it should be
identical to the decision, and it's also not present in the "release"
mode); and partial rewards aren't currently communicated back.

The assumption - just like with the training logger - is that the host
is co-located, thus avoiding any endianness concerns. In a distributed
environment, it is up to the hosting infrastructure to mediate between
the two sides, for example by converting byte order.
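
For the distributed case, the kind of fix-up a hosting layer might apply
can be sketched as below. This is not part of the patch; the function
name and the idea of passing the producer's byte order alongside the data
are assumptions made for illustration.

```python
import array
import sys


def to_native(raw, typecode, source_byteorder):
    """Decode a raw tensor dump produced on a machine whose byte order
    is source_byteorder ("little" or "big") into a list of values."""
    values = array.array(typecode)
    values.frombytes(raw)
    if source_byteorder != sys.byteorder:
        # Swap bytes of every element when producer and consumer differ.
        values.byteswap()
    return values.tolist()
```

When both sides run on the same machine, source_byteorder equals
sys.byteorder and the data passes through unchanged, which matches the
co-located assumption above.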


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D142642

Files:
  llvm/include/llvm/Analysis/InteractiveModelRunner.h
  llvm/include/llvm/Analysis/MLModelRunner.h
  llvm/include/llvm/Analysis/Utils/TrainingLogger.h
  llvm/lib/Analysis/CMakeLists.txt
  llvm/lib/Analysis/InteractiveModelRunner.cpp
  llvm/unittests/Analysis/MLModelRunnerTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D142642.492469.patch
Type: text/x-patch
Size: 11237 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230126/a5b41cbf/attachment.bin>

