[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations

Sun Feb 7 06:31:53 PST 2021

Hello Johannes,

I guess working on the loop transformations is a good starting point –
firstly, it is similar to the already existing code, and secondly, this
problem doesn't look way too hard (especially comparing with other ideas).
To the best of my understanding, it corresponds to refactoring
`LoopUnrollAnalyzer` and, if I understood you correctly, `MacroFusion`, in
the similar way it has been done with the `InlineAdvisor`.

As for the next step, I think that knowledge distillation is a promising
idea – in fact, we can experiment with the approaches from [1], which can
yield a nice inference speed-up in those models.

I think working on some kind of unified pipeline for pass selection and
ordering is also an interesting idea – off the top of my head, a viable
approach here is to consider a pass scheduling as a single-player game and
running a Monte-Carlo tree search to maximize some objective function. For
example, in [2] this kind of approach is used for learning to solve vertex
cover and max-cut, while [3] employs this approach for searching for the
molecule design with the specified properties. See also [4] for a survey of
RL methods (including MCTS) for combinatorial problems.

In case your interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.

I'm afraid I didn't quite understand this one – could you elaborate a bit
more on this topic?

Best regards,
Konstantin Sidorov

[1]
https://github.com/lhyfst/knowledge-distillation-papers#recommended-papers
[2] Abe, Kenshin et al. Solving NP-Hard Problems on Graphs with Extended
AlphaGo Zero. https://arxiv.org/abs/1905.11623
[3] Kajita, S., Kinjo, T. & Nishi, T. Autonomous molecular design by
Monte-Carlo tree search and rapid evaluations using molecular dynamics
simulations. Commun Phys 3, 77 (2020).
https://doi.org/10.1038/s42005-020-0338-y
[4] Mazyavkina, Nina et al. Reinforcement Learning for Combinatorial
Optimization: A Survey. https://arxiv.org/abs/2003.03600

сб, 6 февр. 2021 г. в 23:28, Johannes Doerfert <johannesdoerfert at gmail.com>:

> Hi Konstantin,
>
> I didn't find the time to write new GSoC projects for 2021 yet but
> if you are interested we could probably set one up in this area. I
> also CC'ed Mircea who might be interested in this too, maybe as
> (co-)mentor.
>
> We could look at loop transformations such as unrolling and fusion,
> similar to the inliner work. Best case, we can distill a heuristic
> out of a model we learned. We could also  look at pass selection and
> ordering. We started last year and I was hoping to continue. You
> might want to watch https://youtu.be/TSMputNvHlk?t=617
> <https://youtu.be/TSMputNvHlk?t=617> and
> https://youtu.be/nxfew3hsMFM?t=1435 <https://youtu.be/nxfew3hsMFM?t=1435>
> .
>
> In case your interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.
>
> I also would like to look at ML use in testing and CI.
>
> Let me know what area sounds most interesting to you and we can
> take it from there.
>
> ~ Johannes
>
>
> On 2/6/21 4:35 AM, Сидоров , Константин Сергеевич via llvm-dev wrote:
> > Dear all,
> >
> > I would like to continue the discussion of the GSoC project I mentioned
> in
> > the previous email. Now, when I know my way around the LLVM codebase, I
> > would like to propose the first draft of the plan:
> >
> > * Improving heuristics for existing passes – to start the discussion, I
> > propose to start the project by working on `MLInlineAdvisor` (as far as I
> > understand, in this class the ML infrastructure is already developed, and
> > thus it seems to be a good idea to start there) and after that switching
> to
> > the other passes (e.g., `LoopVectorizationPlanner` seems to be a good
> > candidate for such an approach, and `LoopRotate` class contains a
> > profitability heuristic which could also be studied deeper).
> > * Machine learning models to select the optimizations – to the best of my
> > understanding, the key concept here is the pass manager, but here I don't
> > quite understand the technical details of deciding which optimization to
> > select. For this reason, I would like to discuss this part more
> thoroughly.
> >
> > If the project mentors are reading this mailing list and are interested
> in
> > the discussion, can we start the discussion here?
> >
> > By the way – I would like to thank Stefanos for the comprehensive
> response
> > to my previous questions that helped me to get started :)
> >
> > Looking forward to a further discussion,
> > Konstantin Sidorov
> >
> > вт, 19 янв. 2021 г. в 07:04, Сидоров , Константин Сергеевич <
> > sidorov.ks at phystech.edu>:
> >
> >> Dear all,
> >>
> >> My name is Konstantin Sidorov, and I am a graduate student in
> Mathematics
> >> at Moscow Institute of Physics and Technology.
> >>
> >> I would like to work on a project "Machine learning and compiler
> >> optimizations: using inter-procedural analysis to select optimizations"
> >> during the Google Summer of Code 2021.
> >>
> >> I have an extensive background relevant to this project - in particular:
> >>
> >> * I have already participated in GSoC before in 2017 with mlpack
> >> organization on the project "Augmented RNNs":
> >>
> https://summerofcode.withgoogle.com/archive/2017/projects/4583913502539776/
> >> * In 2019 I have graduated from the Yandex School of Data Analysis — a
> >> two-year program in Data Analysis by Yandex (the leading Russian search
> >> engine); more info on the curriculum could be also found at
> >> https://yandexdataschool.com/.
> >> * I have also been working as a software engineer at Adeptik from July
> >> 2018 to date, where I have predominantly worked on projects on applied
> >> combinatorial optimization problems, such as vehicle-routing problems or
> >> supply chain modeling. In particular, I have had experience with both
> >> metaheuristic algorithms (e.g., local search or genetic algorithms) and
> >> more "traditional" mathematical modeling (e.g., linear programming or
> >> constraint programming).
> >>
> >> I would like to discuss this project in more detail. While it is hard to
> >> discuss any kind of exact plan at this stage, I already have two
> questions
> >> concerning this project:
> >>
> >> (1) I have set up an LLVM dev environment, but I am unsure what to do
> >> next. Could you advise me on any simple (and, preferably, relevant)
> tasks
> >> to work on?
> >> (2) Could you suggest any learning materials to improve the
> understanding
> >> of "low-level" concepts? (E.g., CPU concepts such as caching and SIMD)
> >>
> >> Best regards,
> >> Konstantin Sidorov
> >>
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210207/10f96e52/attachment.html>