[llvm-dev] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations

Thu Mar 26 09:41:51 PDT 2020

Hi Shiva,

apologies for the delayed response.

On 3/24/20 4:13 AM, Shiva Stanford via llvm-dev wrote:
 > I am a grad CS student at Stanford and wanted to engage with EJ Park,
 > Giorgis Georgakoudis, Johannes Doerfert to further develop the Machine
 > Learning and Compiler Optimization concept.

Cool!

 > My background is in machine learning, cluster computing, distributed
 > systems etc. I am a good C/C++ developer and have a strong background in
 > algorithms and data structure.

Sounds good.

 > I am also taking an advanced compiler course this quarter at Stanford. So I
 > would be studying several of these topics anyways - so I thought I might as
 > well co-engage on the LLVM compiler infra project.

Agreed ;)

 > I am currently studying the background information on SCC Call Graphs,
 > Dominator Trees and other Global and inter-procedural analysis to lay some
 > ground work on how to tackle this optimization pass using ML models. I have
 > run a couple of all program function passes and visualized call graphs to
 > get familiarized with the LLVM optimization pass setup. I have also setup
 > and learnt the use of GDB to debug function pass code.

Very nice.

 > I have submitted the ML and Compiler Optimization proposal to GSOC 2020. I
 > have added an additional feature to enhance the ML optimization to include
 > crossover code to GPU and investigate how the function call graphs can be
 > visualized as SCCs across CPU and GPU implementations. If the extension to
 > GPU is too much for a summer project, potentially we can focus on
 > developing a framework for studying SCCs across a unified CPU, GPU setup
 > and leave the coding, if feasible, to next Summer. All preliminary ideas.

I haven't looked at the proposals yet (I think we can only after the
deadline). TBH, I'm not sure I fully understand your extension. Also,
full disclosure, the project is pretty open-ended from my side at least.
I do not necessarily believe we (=llvm) are ready for an ML-driven pass or
even inference in practice. What I want is to explore the use of ML to
improve the code we have, especially heuristics. We build analyses and
transformations but it is hard to combine them in a way that balances
compile-time, code-size, and performance.

Some high-level statements that might help to put my view into
perspective:

 - I want to use ML to identify patterns and code features that we can
   check for using common techniques and that, when we base our decision
   making on them, give us better compile-time, code-size, and/or
   performance.
 - I want to use ML to identify shortcomings in our existing heuristics,
   e.g. transformation cut-off values or pass schedules. This could also
   mean identifying alternative values (or combinations of values) that
   perform substantially better (on some inputs).
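
As a rough illustration of the second point, the sketch below compares a
default cut-off value against alternatives measured on a single input.
Everything in it is hypothetical: `Sample`, `SweepResult`,
`compareToDefault`, and the idea of folding compile-time, code-size, and
run time into one scalar `Cost` are made up for this example and are not
LLVM APIs.

```cpp
#include <cassert>
#include <vector>

// One (hypothetical) measurement: the cost observed on a single input when
// a transformation was run with the given cut-off value. Lower is better.
struct Sample {
  unsigned Cutoff;
  double Cost;
};

// Result of comparing the default cut-off against all measured alternatives.
struct SweepResult {
  unsigned BestCutoff; // cut-off with the lowest measured cost
  double RelativeGain; // (default cost - best cost) / default cost
};

// Finds the cheapest cut-off among Samples and reports how much better it is
// than DefaultCutoff. Assumes Samples is non-empty, contains DefaultCutoff,
// and that the default's cost is positive.
inline SweepResult compareToDefault(const std::vector<Sample> &Samples,
                                    unsigned DefaultCutoff) {
  assert(!Samples.empty() && "need at least one measurement");
  const Sample *Best = &Samples.front();
  const Sample *Default = nullptr;
  for (const Sample &S : Samples) {
    if (S.Cost < Best->Cost)
      Best = &S; // ties keep the earlier sample
    if (S.Cutoff == DefaultCutoff)
      Default = &S;
  }
  assert(Default && "default cut-off must be among the samples");
  assert(Default->Cost > 0.0 && "relative gain needs a positive baseline");
  return {Best->Cutoff, (Default->Cost - Best->Cost) / Default->Cost};
}
```

Inputs where `RelativeGain` is large would be the ones worth inspecting for
patterns or code features that the current heuristic misses.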

 > Not sure how to proceed from here. Hence my email to this list. Please let
 > me know.

The email to the list was a great first step. The next one usually is to
set up an LLVM development and testing environment, that is, LLVM + Clang +
the LLVM Test Suite, that you can use. It is also advised to work on a small
task before GSoC to get used to LLVM development.

I don't have a really small ML "coding" task handy right now but the
project is more about experiments anyway. To get some LLVM development
experience we can just take a small task in the IPO Attributor pass.

One thing we need and we don't have is data. The Attributor is a
fixpoint iteration framework so the number of iterations is a pretty
integral part. We have a statistics counter to determine if the number
required was higher than the given threshold but not one to determine
the maximum iteration count required during compilation. It would be
great if you could add that, that is, a statistics counter that shows how
many iterations were required until a fixpoint was found across all
invocations of the Attributor. Does this make sense? Let me know what
you think and feel free to ask questions via email or on IRC.
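
To make the requested counter concrete, here is a minimal standalone sketch
of a fixpoint loop that tracks the maximum iteration count across
invocations. It does not use the Attributor or LLVM's statistics
infrastructure; `IterationStats`, `runToFixpoint`, and the `Step` callback
are hypothetical names chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical stand-in for a set of statistics counters. It is meant to
// outlive individual solver runs so that values accumulate across them.
struct IterationStats {
  unsigned NumInvocations = 0;       // how many times the solver ran
  unsigned NumThresholdExceeded = 0; // runs stopped by the iteration limit
  unsigned MaxIterations = 0;        // largest iteration count of any run
};

// Calls Step repeatedly until it reports "no change" (a fixpoint) or until
// MaxIterationLimit iterations were performed. Step returns true if it
// changed any state. Returns the number of iterations of this run and
// updates Stats accordingly.
inline unsigned runToFixpoint(const std::function<bool()> &Step,
                              unsigned MaxIterationLimit,
                              IterationStats &Stats) {
  assert(Step && "Step must be callable");
  unsigned Iterations = 0;
  bool Changed = true;
  while (Changed && Iterations < MaxIterationLimit) {
    Changed = Step();
    ++Iterations;
  }
  ++Stats.NumInvocations;
  if (Changed)
    ++Stats.NumThresholdExceeded; // limit hit before a fixpoint was found
  // The new piece of data: remember the worst case over all invocations.
  Stats.MaxIterations = std::max(Stats.MaxIterations, Iterations);
  return Iterations;
}
```

A caller would keep one `IterationStats` object alive for the whole
compilation, pass it to every run, and read `MaxIterations` at the end.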

Cheers,
   Johannes

P.S. Check out the coding style guide and the how to contribute guide!

 > Thank you
 > Shiva Badruswamy
 > shivastanford at gmail.com