[llvm-dev] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Shiva Stanford via llvm-dev
llvm-dev at lists.llvm.org
Mon Mar 30 19:28:10 PDT 2020
Hi Johannes:
1. Attached is the submitted PDF.
2. I have a notes section where I state: I am still unsure of the GPU
extension I proposed, as I don't know how LLVM plays into the GPU
crossover space the way nvcc (Nvidia's compiler, which integrates GCC
and PTX) does. I don't know whether function graphs in the CPU and GPU
namespaces are seamless/continuous within nvcc, or whether nvcc is just
a wrapper that invokes GCC on the CPU sources and PTX on the GPU
sources. So what I have said is: if there is time to investigate, we
could look at this. But I am not sure I am even framing the problem
statement correctly at this point.
3. I have added a tentative tasks section and made a note that the
project is open-ended and things are quite fluid and may change
significantly.
Cheers
Shiva
On Mon, Mar 30, 2020 at 6:52 PM Johannes Doerfert <
johannesdoerfert at gmail.com> wrote:
> On 3/30/20 8:07 PM, Shiva Stanford wrote:
> > 1. Thanks for the clarifications. I will stick to non-containerized OS X
> > for now.
>
> Sounds good. As long as you can build it and run lit and llvm-test suite
> tests :)
>
>
> > 2. As an aside, I did try to build a Debian docker container by git
> > cloning into it and using the Dockerfile in llvm/utils/docker as a
> > starting point:
> > - some changes were needed to update packages (GCC in particular
> > needs to be the latest) and the Debian image (Debian 9 instead of
> > Debian 8), but that pretty much sets up the docker container well.
> > However, for some reason, the Ninja build tool within the CMake
> > generator fails. I am looking into it. Maybe I can produce a working
> > docker workflow for others who want to build and work with LLVM in a
> > container environment.
>
> Feel free to propose a fix but I'm the wrong one to talk to ;)
>
>
> > 3. I have submitted the final proposal to GSoC 2020 today after
> > incorporating some comments and thoughts. When you all get a chance
> > to review, let me know your thoughts.
>
> Good. Can you share the Google Doc with me
> (johannesdoerfert at gmail.com)? [Or did you already, and I misplaced
> the link? In that case, send it again ;)]
>
>
> > 4. On the GPU extension, my thoughts were around what an integrated
> > compiler like Nvidia's nvcc (GCC for CPU + PTX for GPU) does when
> > GCC is substituted with LLVM, and whether that arrangement can be
> > optimized for ML passes. But I am beginning to think that
> > structuring this problem well and doing meaningful work over the
> > summer might be a bit difficult.
>
> As far as I know, neither GCC nor Clang will behave much differently if
> they are used by nvcc than in their standalone mode.
>
> Having an "ML-mode" is probably a generic thing to look at. Though, the
> "high-level" optimizations are not necessarily performed in LLVM-IR.
>
>
> > As mentors, do you have any thoughts on how LLVM might be integrated
> > into a joint CPU-GPU compiler by the likes of Nvidia, Apple etc.?
>
> I'm unsure what exactly you are asking. Clang can be used in CPU-GPU
> compilation via CUDA, OpenCL, OpenMP offload, SYCL, ... is this it?
> I'm personally mostly interested in generic optimizations in this
> space, but quite interested nonetheless. Some ideas:
> - transfer latency hiding (another GSoC project),
> - kernel granularity optimizations (not being worked on yet, but
> these require some infrastructure changes that are as of now still in
> the making),
> - data "location" tracking so we can "move" computation to the right
> device, e.g., for truly dependence-free loops like `#pragma omp loop`
>
> I can list more things but I'm unsure this is the direction you were
> thinking.
>
> Cheers,
> Johannes
>
> > Best
> > Shiva
> >
> >
> >
> > On Mon, Mar 30, 2020 at 5:30 PM Johannes Doerfert <
> > johannesdoerfert at gmail.com> wrote:
> >
> >>
> >> On 3/27/20 3:46 PM, Shiva Stanford wrote:
> >>> Hi Johannes - great we are engaging on this.
> >>>
> >>> Some responses now and some later.
> >>>
> >>> 1. When you say set up LLVM dev environment + clang + tools etc.,
> >>> do you mean set up the LLVM compiler code from the repo and build
> >>> it locally? If so, yes, this is all done from my end - that is, I
> >>> have built all this on my machine and compiled and run a couple of
> >>> function passes. I have looked at some LLVM emits from Clang tools
> >>> but I will familiarize myself more. I have added some small code
> >>> segments, modified CMakeLists files, and re-built the code to get
> >>> a feel for the packaging structure. Btw, is there a version of a
> >>> Bazel build for this? Right now, I am using OS X as the SDK, as
> >>> Apple is the one that has adopted LLVM the most. But I can switch
> >>> to Linux containers to completely wall off the LLVM build against
> >>> any OS X system builds to prevent path obfuscation and truly have
> >>> a separate address space. Is there a preferable environment? In
> >>> any case, I am thinking of containerizing the build, so OS X
> >>> system paths don't interfere with include paths - have you
> >>> received feedback from other developers on whether the include
> >>> paths interfere with an OS X LLVM system build?
> >>
> >>
> >> Setup sounds good.
> >>
> >> I have never used OS X but people do and I would expect it to be OK.
> >>
> >> I don't think you need to worry about this right now.
> >>
> >>
> >>> 2. The Attributor pass refactoring gives some specific direction
> >>> as a startup project - so that's great. Let me study this pass and
> >>> I will get back to you with more questions.
> >>
> >> Sure.
> >>
> >>
> >>> 3. Yes, I will stick to the style guide (Baaaah... Stanford is
> >>> strict on code styling and so are you guys :)) for sure.
> >>
> >> For better or worse.
> >>
> >>
> >> Cheers,
> >>
> >> Johannes
> >>
> >>
> >>
> >>> On Thu, Mar 26, 2020 at 9:42 AM Johannes Doerfert <
> >>> johannesdoerfert at gmail.com> wrote:
> >>>
> >>>> Hi Shiva,
> >>>>
> >>>> apologies for the delayed response.
> >>>>
> >>>> On 3/24/20 4:13 AM, Shiva Stanford via llvm-dev wrote:
> >>>> > I am a grad CS student at Stanford and wanted to engage with EJ
> >>>> > Park, Giorgis Georgakoudis, and Johannes Doerfert to further
> >>>> > develop the Machine Learning and Compiler Optimization concept.
> >>>>
> >>>> Cool!
> >>>>
> >>>>
> >>>> > My background is in machine learning, cluster computing,
> >>>> > distributed systems, etc. I am a good C/C++ developer and have
> >>>> > a strong background in algorithms and data structures.
> >>>>
> >>>> Sounds good.
> >>>>
> >>>>
> >>>> > I am also taking an advanced compiler course this quarter at
> >>>> > Stanford, so I would be studying several of these topics
> >>>> > anyway - so I thought I might as well co-engage on the LLVM
> >>>> > compiler infra project.
> >>>>
> >>>> Agreed ;)
> >>>>
> >>>>
> >>>> > I am currently studying the background information on SCC call
> >>>> > graphs, dominator trees, and other global and inter-procedural
> >>>> > analyses to lay some groundwork on how to tackle this
> >>>> > optimization pass using ML models. I have run a couple of
> >>>> > all-program function passes and visualized call graphs to get
> >>>> > familiarized with the LLVM optimization pass setup. I have also
> >>>> > set up and learnt the use of GDB to debug function pass code.
> >>>>
> >>>> Very nice.
> >>>>
> >>>>
> >>>> > I have submitted the ML and Compiler Optimization proposal to
> >>>> > GSoC 2020. I have added an additional feature to enhance the ML
> >>>> > optimization to include crossover code to GPU and investigate
> >>>> > how the function call graphs can be visualized as SCCs across
> >>>> > CPU and GPU implementations. If the extension to GPU is too
> >>>> > much for a summer project, potentially we can focus on
> >>>> > developing a framework for studying SCCs across a unified
> >>>> > CPU+GPU setup and leave the coding, if feasible, to next
> >>>> > summer. All preliminary ideas.
> >>>>
> >>>> I haven't looked at the proposals yet (I think we can only do so
> >>>> after the deadline). TBH, I'm not sure I fully understand your
> >>>> extension. Also, full disclosure, the project is pretty
> >>>> open-ended from my side at least. I do not necessarily believe we
> >>>> (=LLVM) are ready for an ML-driven pass or even inference in
> >>>> practice. What I want is to explore the use of ML to improve the
> >>>> code we have, especially heuristics. We build analyses and
> >>>> transformations, but it is hard to combine them in a way that
> >>>> balances compile-time, code-size, and performance.
> >>>>
> >>>> Some high-level statements that might help to put my view into
> >>>> perspective:
> >>>>
> >>>> I want to use ML to identify patterns and code features that we
> >>>> can check for using common techniques, but such that when we base
> >>>> our decision making on these patterns or features we achieve
> >>>> better compile-time, code-size, and/or performance.
> >>>> I want to use ML to identify shortcomings in our existing
> >>>> heuristics, e.g., transformation cut-off values or pass
> >>>> schedules. This could also mean identifying alternative
> >>>> (combinations of) values that perform substantially better (on
> >>>> some inputs).
> >>>>
> >>>>
> >>>> > Not sure how to proceed from here. Hence my email to this list.
> >>>> Please let
> >>>> > me know.
> >>>>
> >>>> The email to the list was a great first step. The next one
> >>>> usually is to set up an LLVM development and testing environment,
> >>>> thus LLVM + Clang + LLVM Test Suite, that you can use. It is also
> >>>> advisable to work on a small task before the GSoC to get used to
> >>>> LLVM development.
> >>>>
> >>>> I don't have a really small ML "coding" task handy right now but the
> >>>> project is more about experiments anyway. To get some LLVM
> development
> >>>> experience we can just take a small task in the IPO Attributor pass.
> >>>>
> >>>> One thing we need and don't have is data. The Attributor is a
> >>>> fixpoint iteration framework, so the number of iterations is a
> >>>> pretty integral part. We have a statistics counter to determine
> >>>> if the number required was higher than the given threshold, but
> >>>> not one to determine the maximum iteration count required during
> >>>> compilation. It would be great if you could add that, thus a
> >>>> statistics counter that shows how many iterations were required
> >>>> until a fixpoint was found, across all invocations of the
> >>>> Attributor. Does this make sense? Let me know what you think and
> >>>> feel free to ask questions via email or on IRC.
> >>>>
> >>>> Cheers,
> >>>> Johannes
> >>>>
> >>>> P.S. Check out the coding style guide and the how to contribute
> guide!
> >>>>
> >>>>
> >>>> > Thank you
> >>>> > Shiva Badruswamy
> >>>> > shivastanford at gmail.com
> >>>> >
> >>>> >
> >>>> > _______________________________________________
> >>>> > LLVM Developers mailing list
> >>>> > llvm-dev at lists.llvm.org
> >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>>
> >>>>
> >>
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200330/6d40ad8d/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Final_Proposal_ MachineLearningAndCompilerOptimization.pdf
Type: application/pdf
Size: 40819 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200330/6d40ad8d/attachment-0001.pdf>
More information about the llvm-dev
mailing list