[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Sun Feb 7 15:30:21 PST 2021


Hi Konstantin,

On 2/7/21 8:31 AM, Сидоров , Константин Сергеевич wrote:
> Hello Johannes,
>
> I guess working on the loop transformations is a good starting point –
> firstly, it is similar to already existing code, and secondly, this
> problem doesn't look too hard (especially compared with the other ideas).
> To the best of my understanding, it corresponds to refactoring
> `LoopUnrollAnalyzer` and, if I understood you correctly, `MacroFusion`, in
> a similar way to what has been done with the `InlineAdvisor`.

That sounds good to me. As you said, we'd have a nice "template"
to guide such exploration. Given that GSoC is short this year, that
certainly sounds better than a big standalone project.


>
> As for the next step, I think that knowledge distillation is a promising
> idea – in fact, we can experiment with the approaches from [1], which can
> yield a nice inference speed-up in those models.

What I meant was that it would be nice if we came up not only with
an ML advisor but also with an improved non-ML heuristic based on the
insights we can extract from the model. I'm not sure if that came
across right.
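
To make that a bit more concrete, here is a minimal sketch (Python,
purely illustrative; the loop features, the stand-in "advisor", and the
data are all made up) of what distilling a readable heuristic out of a
learned advisor could look like:

  # Hypothetical sketch: distill readable rules from an ML unrolling advisor.
  # The features and the stand-in "advisor" below are made up for illustration.
  import numpy as np
  from sklearn.tree import DecisionTreeClassifier, export_text

  rng = np.random.default_rng(0)

  # Pretend per-loop features: trip count estimate, body size (IR instructions),
  # and number of memory operations.
  X = rng.integers(1, 128, size=(1000, 3))

  # Stand-in for the learned advisor: whatever model we trained answers
  # "unroll this loop or not" for a feature vector.
  def ml_advisor_decision(features):
      trip_count, body_size, mem_ops = features
      return int(trip_count < 32 and body_size < 40)

  y = np.array([ml_advisor_decision(row) for row in X])

  # Fit a small tree on the model's decisions and print it as if/else rules,
  # which is the part we could translate into a hand-written heuristic.
  tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
  print(export_text(tree, feature_names=["trip_count", "body_size", "mem_ops"]))

The point is just that once we have a model, fitting a small interpretable
proxy on its decisions gives us candidate rules for a non-ML predictor.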


>
> I think working on some kind of unified pipeline for pass selection and
> ordering is also an interesting idea – off the top of my head, a viable
> approach here is to consider pass scheduling as a single-player game and
> to run a Monte-Carlo tree search to maximize some objective function. For
> example, in [2] this kind of approach is used for learning to solve vertex
> cover and max-cut, while [3] employs it to search for molecule designs
> with specified properties. See also [4] for a survey of
> RL methods (including MCTS) for combinatorial problems.

The caveat is that we actually don't want to learn arbitrary pass
pipelines because they are simply not practical. That said, we
could define/learn building blocks which we then combine. I would
however start small here as well, e.g., take a canonicalization
pass such as simplifycfg and learn when in the pipeline we should
run it, potentially based on IR features.
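
As a strawman for how that could be framed (a sketch only; the IR
features and labels below are placeholders I made up, and real labels
would have to come from measuring whether running the pass at that
point actually paid off):

  # Hypothetical sketch: gate simplifycfg at one pipeline point on a learned predicate.
  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(1)

  # Per-function IR features: basic blocks, branches, trivially foldable blocks, phis.
  X = rng.integers(0, 200, size=(500, 4))

  # Placeholder label for "running simplifycfg here improved the downstream metric";
  # in practice this would be measured, not synthesized.
  y = (X[:, 1] + X[:, 2] > 120).astype(int)

  clf = LogisticRegression(max_iter=1000).fit(X, y)

  def should_run_simplifycfg(features):
      # Run the pass only when the learned predicate says it is likely to help.
      return bool(clf.predict([features])[0])

  print(should_run_simplifycfg([40, 90, 60, 5]))

The same framing would work for any canonicalization pass whose
placement we want to learn rather than hard-code.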


>
>> In case you're interested in a runtime topic, I really would love to
>> have a predictor for grid/block/thread block size for (OpenMP) GPU
>> kernels. We are having real trouble on that end.
> I'm afraid I didn't quite understand this one – could you elaborate a bit
> more on this topic?

So the problem is as follows:

Given a GPU kernel, how many thread blocks and how many threads per
thread block should we use? This can heavily impact performance, and so
far we use semi-arbitrary numbers in the OpenMP GPU offloading runtime.

We would use IR information, PTX information, GPU information, and
kernel parameters to learn and eventually make a decision.
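
One way I could imagine framing it (just a sketch; the features and the
"profiling" data here are invented, and in reality the target would come
from measuring kernels under different launch configurations):

  # Hypothetical sketch: predict a thread-block size for an OpenMP GPU kernel.
  import numpy as np
  from sklearn.ensemble import RandomForestRegressor

  rng = np.random.default_rng(2)

  # Example per-kernel features: registers per thread, shared memory per block (KB),
  # trip count of the offloaded loop, number of kernel arguments.
  X = rng.integers(1, 256, size=(800, 4))

  # Placeholder target: the block size profiling found fastest for each kernel.
  y = rng.choice([64, 128, 256, 512], size=800)

  model = RandomForestRegressor(n_estimators=50).fit(X, y)

  def pick_block_size(features, warp_size=32):
      predicted = model.predict([features])[0]
      # Snap to a multiple of the warp size and clamp to a sane hardware limit.
      return int(min(max(warp_size, round(predicted / warp_size) * warp_size), 1024))

  print(pick_block_size([32, 8, 10000, 6]))

The grid size could be handled the same way, or derived from the loop
trip count once the block size is fixed.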

Does that make more sense?


~ Johannes

P.S. Given that we are defining a project via email, I will not write
it up for the open projects page. Instead, I expect you to simply apply
to GSoC with a description we come up with here.



>
> Best regards,
> Konstantin Sidorov
>
> [1]
> https://github.com/lhyfst/knowledge-distillation-papers#recommended-papers
> [2] Abe, Kenshin et al. Solving NP-Hard Problems on Graphs with Extended
> AlphaGo Zero. https://arxiv.org/abs/1905.11623
> [3] Kajita, S., Kinjo, T. & Nishi, T. Autonomous molecular design by
> Monte-Carlo tree search and rapid evaluations using molecular dynamics
> simulations. Commun Phys 3, 77 (2020).
> https://doi.org/10.1038/s42005-020-0338-y
> [4] Mazyavkina, Nina et al. Reinforcement Learning for Combinatorial
> Optimization: A Survey. https://arxiv.org/abs/2003.03600
>
> Sat, Feb 6, 2021 at 23:28, Johannes Doerfert <johannesdoerfert at gmail.com>:
>
>> Hi Konstantin,
>>
>> I didn't find the time to write new GSoC projects for 2021 yet but
>> if you are interested we could probably set one up in this area. I
>> also CC'ed Mircea who might be interested in this too, maybe as
>> (co-)mentor.
>>
>> We could look at loop transformations such as unrolling and fusion,
>> similar to the inliner work. Best case, we can distill a heuristic
>> out of a model we learned. We could also look at pass selection and
>> ordering. We started last year and I was hoping to continue. You
>> might want to watch https://youtu.be/TSMputNvHlk?t=617 and
>> https://youtu.be/nxfew3hsMFM?t=1435.
>>
>> In case you're interested in a runtime topic, I really would love to
>> have a predictor for grid/block/thread block size for (OpenMP) GPU
>> kernels. We are having real trouble on that end.
>>
>> I also would like to look at ML use in testing and CI.
>>
>> Let me know what area sounds most interesting to you and we can
>> take it from there.
>>
>> ~ Johannes
>>
>>
>> On 2/6/21 4:35 AM, Сидоров , Константин Сергеевич via llvm-dev wrote:
>>> Dear all,
>>>
>>> I would like to continue the discussion of the GSoC project I mentioned
>>> in the previous email. Now that I know my way around the LLVM codebase, I
>>> would like to propose a first draft of the plan:
>>>
>>> * Improving heuristics for existing passes – to start the discussion, I
>>> propose working on `MLInlineAdvisor` first (as far as I understand, the ML
>>> infrastructure is already developed for this class, and thus it seems to be
>>> a good place to start) and after that switching to the other passes (e.g.,
>>> `LoopVectorizationPlanner` seems to be a good candidate for such an
>>> approach, and the `LoopRotate` class contains a profitability heuristic
>>> which could also be studied more deeply).
>>> * Machine learning models to select the optimizations – to the best of my
>>> understanding, the key concept here is the pass manager, but I don't quite
>>> understand the technical details of deciding which optimization to select.
>>> For this reason, I would like to discuss this part more thoroughly. If the
>>> project mentors are reading this mailing list and are interested, can we
>>> start the discussion here?
>>>
>>> By the way – I would like to thank Stefanos for the comprehensive response
>>> to my previous questions that helped me to get started :)
>>>
>>> Looking forward to a further discussion,
>>> Konstantin Sidorov
>>>
>>> Tue, Jan 19, 2021 at 07:04, Сидоров, Константин Сергеевич <
>>> sidorov.ks at phystech.edu>:
>>>
>>>> Dear all,
>>>>
>>>> My name is Konstantin Sidorov, and I am a graduate student in Mathematics
>>>> at Moscow Institute of Physics and Technology.
>>>>
>>>> I would like to work on a project "Machine learning and compiler
>>>> optimizations: using inter-procedural analysis to select optimizations"
>>>> during the Google Summer of Code 2021.
>>>>
>>>> I have an extensive background relevant to this project - in particular:
>>>>
>>>> * I have already participated in GSoC before, in 2017, with the mlpack
>>>> organization on the project "Augmented RNNs":
>>>> https://summerofcode.withgoogle.com/archive/2017/projects/4583913502539776/
>>>> * In 2019 I graduated from the Yandex School of Data Analysis – a
>>>> two-year program in Data Analysis by Yandex (the leading Russian search
>>>> engine); more info on the curriculum can also be found at
>>>> https://yandexdataschool.com/.
>>>> * I have also been working as a software engineer at Adeptik from July
>>>> 2018 to date, where I have predominantly worked on applied combinatorial
>>>> optimization problems, such as vehicle routing or supply chain modeling.
>>>> In particular, I have had experience with both metaheuristic algorithms
>>>> (e.g., local search or genetic algorithms) and more "traditional"
>>>> mathematical modeling (e.g., linear programming or constraint
>>>> programming).
>>>>
>>>> I would like to discuss this project in more detail. While it is hard to
>>>> discuss any kind of exact plan at this stage, I already have two questions
>>>> concerning this project:
>>>>
>>>> (1) I have set up an LLVM dev environment, but I am unsure what to do
>>>> next. Could you advise me on any simple (and, preferably, relevant) tasks
>>>> to work on?
>>>> (2) Could you suggest any learning materials to improve my understanding
>>>> of "low-level" concepts? (E.g., CPU concepts such as caching and SIMD)
>>>>
>>>> Best regards,
>>>> Konstantin Sidorov
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

