[llvm-dev] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Shiva Stanford via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 31 03:22:54 PDT 2020
1. Draft proposals via gdoc. Final via PDF.
2. I did not see any timeline requests from GSoC, but spring quarter ends
around June 6, possibly a week later due to Coronavirus schedule delays.
Summer begins then. I will look into it some more in the morning and see
what I can add to the timeline.
Thanks.
On Mon, Mar 30, 2020 at 11:43 PM Johannes Doerfert <
johannesdoerfert at gmail.com> wrote:
>
> On 3/30/20 9:28 PM, Shiva Stanford wrote:
> > Hi Johannes:
> >
> > 1. Attached is the submitted PDF.
>
> I thought they make you submit via gdoc and I also thought they wanted a
> timeline and had other requirements. Please verify this so it's not a
> problem (I base this on the proposals I've seen this year and not on the
> information actually provided by GSoC).
>
>
> > 2. I have a notes section where I state: I am still unsure of the GPU
> > extension I proposed, as I don't know how LLVM plays into the GPU
> > crossover space the way nvcc (Nvidia's compiler, which integrates gcc
> > and PTX) does.
>
> You can use clang as "host compiler". As mentioned before, there is
> clang-cuda and OpenMP offloading also generates PTX for the GPU code.
>
>
> > I don't know if there is a chance that function graphs in the CPU+GPU
> > namespaces are seamless/continuous within nvcc, or if nvcc is just a
> > wrapper that invokes gcc on the CPU sources and PTX on the GPU
> > sources.
>
> Something like that as far as I know.
>
>
> > So what I have said is: if there is time to investigate, we could
> > look at this. But I am not sure I am even framing the problem
> > statement correctly at this point.
>
> As I said, I'd be very happy for you to also work on GPU-related things;
> what exactly can be defined over the next weeks.
>
> GPU offloading is by nature inter-procedural (take CUDA kernels) so
> creating the infrastructure to alter the granularity of kernels
> (when/where to fuse/split them) could be a task. For this it is fairly
> important (as far as I know now) to predict the register usage
> accurately. Using learning here might be interesting as well.
>
> As you mention in the pdf, one can also split the index space to balance
> computation. When we implement something like `pragma omp loop` we can
> also balance computations across multiple GPUs as long as we get the
> data movement right.
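To make the index-space-splitting idea above concrete, here is a minimal standalone sketch of chunking a loop's iteration space across devices by relative throughput. This is illustrative only, not LLVM, OpenMP, or CUDA runtime code; the function name and the weight model are invented for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: split the index space [0, n) into one contiguous
// chunk per device, sized by the device's relative throughput weight.
// Device i then runs iterations [chunks[i].first, chunks[i].second).
std::vector<std::pair<std::size_t, std::size_t>>
splitIndexSpace(std::size_t n, const std::vector<double> &weights) {
  double total = 0;
  for (double w : weights)
    total += w;
  std::vector<std::pair<std::size_t, std::size_t>> chunks;
  std::size_t begin = 0;
  double acc = 0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    acc += weights[i];
    // The last device takes the remainder so no iteration is dropped.
    std::size_t end = (i + 1 == weights.size())
                          ? n
                          : static_cast<std::size_t>(n * (acc / total));
    chunks.emplace_back(begin, end);
    begin = end;
  }
  return chunks;
}
```

With two GPUs rated 1:3, `splitIndexSpace(100, {1.0, 3.0})` assigns [0, 25) to the slower device and [25, 100) to the faster one; getting the data movement right, as noted above, is what makes this hard in practice.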
>
>
> > 3. I have added a tentative tasks section and made a note that the
> > project is open ended and things are quite fluid and may change
> > significantly.
>
> That is good. This is a moving target and an open-ended task; I expect
> things to be determined more clearly as we go, based on the data we
> gather.
>
> Cheers,
> Johannes
>
>
> > Cheers Shiva
> >
> >
> > On Mon, Mar 30, 2020 at 6:52 PM Johannes Doerfert <
> > johannesdoerfert at gmail.com> wrote:
> >
> >> On 3/30/20 8:07 PM, Shiva Stanford wrote:
> >> > 1. Thanks for the clarifications. I will stick to
> >> > non-containerized OS X for now.
> >>
> >> Sounds good. As long as you can build it and run the lit and
> >> llvm-test-suite tests :)
> >>
> >>
> >> > 2. As an aside, I did try to build a Debian docker container by
> >> > git cloning into it and using the Dockerfile in LLVM/utils/docker
> >> > as a starting point: some changes to update packages (GCC in
> >> > particular needs to be latest) and the Debian image (Debian 9
> >> > instead of Debian 8) pretty much set up the docker container well.
> >> > But for some reason, the Ninja build tool within the CMake
> >> > generator fails. I am looking into it. Maybe I can produce a
> >> > working docker workflow for others who want to build and work with
> >> > LLVM in a container environment.
> >>
> >> Feel free to propose a fix but I'm the wrong one to talk to ;)
> >>
> >>
> >> > 3. I have submitted the final proposal to GSoC 2020 today after
> >> > incorporating some comments and thoughts. When you all get a
> >> > chance to review, let me know your thoughts.
> >>
> >> Good. Can you share the Google doc with me
> >> (johannesdoerfert at gmail.com)? [Or did you already, and I misplaced
> >> the link? In that case, send it again ;)]
> >>
> >>
> >> > 4. On GPU extension, my thoughts were around what an integrated
> >> > compiler like Nvidia's nvcc (GCC for CPU + PTX for GPU) does when
> >> > GCC is substituted with LLVM, and if that arrangement can be
> >> > optimized for ML passes. But I am beginning to think that
> >> > structuring this problem well and doing meaningful work over the
> >> > summer might be a bit difficult.
> >>
> >> As far as I know, neither GCC nor Clang behaves much differently
> >> when used by nvcc than in standalone mode.
> >>
> >> Having an "ML mode" is probably a generic thing to look at, though
> >> the "high-level" optimizations are not necessarily performed on
> >> LLVM-IR.
> >>
> >>
> >> > As mentors, do you have any thoughts on how LLVM might be
> >> > integrated into a joint CPU-GPU compiler by the likes of Nvidia,
> >> > Apple etc.?
> >>
> >> I'm unsure what exactly you are asking. Clang can be used in CPU-GPU
> >> compilation via CUDA, OpenCL, OpenMP offload, SYCL, ... is this it?
> >> I'm personally mostly interested in generic optimizations in this
> >> space, but actually quite interested. Some ideas:
> >> - transfer latency hiding (another GSoC project),
> >> - kernel granularity optimizations (not being worked on yet, but
> >>   requiring some infrastructure changes that are as of now still in
> >>   the making),
> >> - data "location" tracking so we can "move" computation to the right
> >>   device, e.g., for really dependence-free loops like `pragma omp loop`
> >>
> >> I can list more things but I'm unsure this is the direction you were
> >> thinking.
> >>
> >> Cheers, Johannes
> >>
> >> > Best Shiva
> >> >
> >> >
> >> >
> >> > On Mon, Mar 30, 2020 at 5:30 PM Johannes Doerfert <
> >> > johannesdoerfert at gmail.com> wrote:
> >> >
> >> >>
> >> >> On 3/27/20 3:46 PM, Shiva Stanford wrote:
> >> >>> Hi Johannes - great we are engaging on this.
> >> >>>
> >> >>> Some responses now and some later.
> >> >>>
> >> >>> 1. When you say set up an LLVM dev environment + clang + tools
> >> >>> etc., do you mean set up the LLVM compiler code from the repo and
> >> >>> build it locally? If so, yes, this is all done from my end - that
> >> >>> is, I have built all this on my machine and compiled and run a
> >> >>> couple of function passes. I have looked at some LLVM emits from
> >> >>> clang tools, but I will familiarize more. I have added some small
> >> >>> code segments, modified CMakeLists files, and re-built the code
> >> >>> to get a feel for the packaging structure. Btw, is there a Bazel
> >> >>> build for this? Right now, I am using OS X as the SDK, as Apple
> >> >>> is the one that has adopted LLVM the most. But I can switch to
> >> >>> Linux containers to completely wall off the LLVM build against
> >> >>> any OS X system builds, to prevent path obfuscation and truly
> >> >>> have a separate address space. Is there a preferable environment?
> >> >>> In any case, I am thinking of containerizing the build, so OS X
> >> >>> system paths don't interfere with include paths - have you
> >> >>> received feedback from other developers on whether the include
> >> >>> paths interfere with an OS X LLVM system build?
> >> >>
> >> >>
> >> >> Setup sounds good.
> >> >>
> >> >> I have never used OS X but people do and I would expect it to be
> >> >> OK.
> >> >>
> >> >> I don't think you need to worry about this right now.
> >> >>
> >> >>
> >> >>> 2. The Attributor pass refactoring gives some specific direction
> >> >>> as a startup project - so that's great. Let me study this pass
> >> >>> and I will get back to you with more questions.
> >> >>
> >> >> Sure.
> >> >>
> >> >>
> >> >>> 3. Yes, I will stick to the style guide (Baaaah... Stanford is
> >> >>> strict on code styling and so are you guys :)) for sure.
> >> >>
> >> >> For better or worse.
> >> >>
> >> >>
> >> >> Cheers,
> >> >>
> >> >> Johannes
> >> >>
> >> >>
> >> >>
> >> >>> On Thu, Mar 26, 2020 at 9:42 AM Johannes Doerfert <
> >> >>> johannesdoerfert at gmail.com> wrote:
> >> >>>
> >> >>>> Hi Shiva,
> >> >>>>
> >> >>>> apologies for the delayed response.
> >> >>>>
> >> >>>> On 3/24/20 4:13 AM, Shiva Stanford via llvm-dev wrote:
> >> >>>> > I am a grad CS student at Stanford and wanted to engage with
> >> >>>> > EJ Park, Giorgis Georgakoudis, and Johannes Doerfert to
> >> >>>> > further develop the Machine Learning and Compiler Optimization
> >> >>>> > concept.
> >> >>>>
> >> >>>> Cool!
> >> >>>>
> >> >>>>
> >> >>>> > My background is in machine learning, cluster computing,
> >> >>>> > distributed systems, etc. I am a good C/C++ developer and have
> >> >>>> > a strong background in algorithms and data structures.
> >> >>>>
> >> >>>> Sounds good.
> >> >>>>
> >> >>>>
> >> >>>> > I am also taking an advanced compiler course this quarter at
> >> >>>> > Stanford, so I would be studying several of these topics
> >> >>>> > anyway; I thought I might as well co-engage on the LLVM
> >> >>>> > compiler infra project.
> >> >>>>
> >> >>>> Agreed ;)
> >> >>>>
> >> >>>>
> >> >>>> > I am currently studying the background information on SCC
> >> >>>> > call graphs, dominator trees, and other global and
> >> >>>> > inter-procedural analyses to lay some groundwork on how to
> >> >>>> > tackle this optimization pass using ML models. I have run a
> >> >>>> > couple of whole-program function passes and visualized call
> >> >>>> > graphs to get familiarized with the LLVM optimization pass
> >> >>>> > setup. I have also set up and learnt the use of GDB to debug
> >> >>>> > function pass code.
> >> >>>>
> >> >>>> Very nice.
> >> >>>>
> >> >>>>
> >> >>>> > I have submitted the ML and Compiler Optimization proposal to
> >> >>>> > GSoC 2020. I have added an additional feature to enhance the
> >> >>>> > ML optimization to include crossover code to GPU and
> >> >>>> > investigate how the function call graphs can be visualized as
> >> >>>> > SCCs across CPU and GPU implementations. If the extension to
> >> >>>> > GPU is too much for a summer project, potentially we can focus
> >> >>>> > on developing a framework for studying SCCs across a unified
> >> >>>> > CPU, GPU setup and leave the coding, if feasible, to next
> >> >>>> > summer. All preliminary ideas.
> >> >>>>
> >> >>>> I haven't looked at the proposals yet (I think we can only do
> >> >>>> so after the deadline). TBH, I'm not sure I fully understand
> >> >>>> your extension. Also, full disclosure, the project is pretty
> >> >>>> open-ended, from my side at least. I do not necessarily believe
> >> >>>> we (=LLVM) are ready for an ML-driven pass or even inference in
> >> >>>> practice. What I want is to explore the use of ML to improve
> >> >>>> the code we have, especially heuristics. We build analyses and
> >> >>>> transformations, but it is hard to combine them in a way that
> >> >>>> balances compile-time, code-size, and performance.
> >> >>>>
> >> >>>> Some high-level statements that might help to put my view into
> >> >>>> perspective:
> >> >>>>
> >> >>>> I want to use ML to identify patterns and code features that we
> >> >>>> can check for using common techniques, but where basing our
> >> >>>> decision making on these patterns or features achieves better
> >> >>>> compile-time, code-size, and/or performance. I want to use ML
> >> >>>> to identify shortcomings in our existing heuristics, e.g.,
> >> >>>> transformation cut-off values or pass schedules. This could
> >> >>>> also mean identifying alternative (combinations of) values that
> >> >>>> perform substantially better (on some inputs).
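As one way to picture the "cut-off values" mentioned above, here is a toy cost heuristic with a hard-coded threshold of the kind a learned model could replace. All feature names, weights, and the threshold are invented for this sketch; this is not the actual LLVM inline cost model.

```cpp
// Toy stand-in for a transformation heuristic: every feature, weight,
// and the cut-off below is invented for illustration.
struct CalleeFeatures {
  int NumInstructions;
  int NumCallSites;
  bool HasLoops;
};

// The kind of hand-picked constant that ML tuning could replace with a
// value learned per target or per workload.
constexpr int InlineThreshold = 225;

int computeCost(const CalleeFeatures &F) {
  int Cost = F.NumInstructions * 5; // size term
  Cost += F.NumCallSites * 25;      // call-overhead term
  if (F.HasLoops)
    Cost += 100;                    // loop penalty
  return Cost;
}

bool shouldInline(const CalleeFeatures &F) {
  return computeCost(F) < InlineThreshold;
}
```

The interesting question is not the formula itself but whether a learned threshold (or a learned combination of feature weights) beats the hand-picked one on real inputs while keeping compile-time in check.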
> >> >>>>
> >> >>>>
> >> >>>> > Not sure how to proceed from here. Hence my email to this
> >> >>>> > list. Please let me know.
> >> >>>>
> >> >>>> The email to the list was a great first step. The next one
> >> >>>> usually is to set up an LLVM development and testing
> >> >>>> environment, that is, LLVM + Clang + the LLVM test suite. It is
> >> >>>> also advised to work on a small task before the GSoC to get
> >> >>>> used to LLVM development.
> >> >>>>
> >> >>>> I don't have a really small ML "coding" task handy right now,
> >> >>>> but the project is more about experiments anyway. To get some
> >> >>>> LLVM development experience we can just take a small task in
> >> >>>> the IPO Attributor pass.
> >> >>>>
> >> >>>> One thing we need and we don't have is data. The Attributor is
> >> >>>> a fixpoint iteration framework, so the number of iterations is
> >> >>>> a pretty integral part. We have a statistics counter to
> >> >>>> determine if the number required was higher than the given
> >> >>>> threshold, but not one to determine the maximum iteration count
> >> >>>> required during compilation. It would be great if you could add
> >> >>>> that: a statistics counter that shows how many iterations were
> >> >>>> required until a fixpoint was found, across all invocations of
> >> >>>> the Attributor. Does this make sense? Let me know what you
> >> >>>> think and feel free to ask questions via email or on IRC.
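The requested statistic boils down to a max-reduction over per-invocation iteration counts. In-tree this would presumably use the `STATISTIC` machinery from llvm/ADT/Statistic.h, updated with a max rather than an increment; the self-contained sketch below uses a plain global so the idea runs on its own, and the hook name is invented.

```cpp
#include <algorithm>

// Stand-in for an LLVM statistics counter: track the maximum number of
// fixpoint iterations needed across all Attributor invocations.
static unsigned MaxFixpointIterations = 0;

// Hypothetical hook, called once per Attributor run with the number of
// iterations that run needed to reach a fixpoint.
void recordFixpointIterations(unsigned Iterations) {
  MaxFixpointIterations = std::max(MaxFixpointIterations, Iterations);
}
```

After a whole compilation, `MaxFixpointIterations` holds the worst-case iteration count, which is exactly the data point missing from the existing over-threshold counter.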
> >> >>>>
> >> >>>> Cheers, Johannes
> >> >>>>
> >> >>>> P.S. Check out the coding style guide and the how-to-contribute
> >> >>>> guide!
> >> >>>>
> >> >>>>
> >> >>>> > Thank you,
> >> >>>> > Shiva Badruswamy
> >> >>>> > shivastanford at gmail.com
> >> >>>> >
> >> >>>> >
> >> >>>>
> >> >>>>
> >> >>
> >> >
> >>
> >>
> >
>
>