[llvm-dev] [RFC] Move py-mlir-release to new top-level repo in the LLVM org
Tom Stellard via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 7 17:05:43 PST 2021
On 1/7/21 3:17 PM, Stella Laurenzo wrote:
>
>
> On Thu, Jan 7, 2021 at 2:40 PM Tom Stellard <tstellar at redhat.com> wrote:
>
> On 1/7/21 10:55 AM, Stella Laurenzo via llvm-dev wrote:
> > Hi folks, I would like to propose that we create a new top-level repo
> > in the LLVM organization for organizing the Python MLIR Releases (both
> > daily and official numbered releases, whenever we are ready for such a
> > thing) and corresponding pushes to package repositories, etc.
> >
>
> For those of us who are unfamiliar, can you explain what the "Python
> MLIR Releases" are?
>
>
> Sure: They are the Python wheels and source distributions for the [MLIR
> Python Bindings](https://mlir.llvm.org/docs/Bindings/Python/). The key
> is that we do them in accordance with how Python packages get released
> and push them through standard channels for deployment, and this
> involves some gymnastics (what I have now will grow in complexity as we
> do this, based on the experience of other projects).
> They basically include everything such that if you do a "pip install
> mlir" you get a working package that is able to build and compile
> MLIR-based IR in a variety of forms. An ancillary function of them is to
> enable downstream Python-based projects to extend the system, so it
> entails distributing enough headers and libraries to make this feasible.
>
Ok, so it's this python code: llvm-project/mlir/lib/Bindings/Python ?
>
> > I have prototyped such a release process in a personal repo:
> > https://github.com/stellaraccident/mlir-py-release
> >
> > Additional development on that release process is currently blocked on
> > more work on the shared library organization in LLVM (discussed here
> > https://lists.llvm.org/pipermail/llvm-dev/2021-January/147567.html and
> > being worked on independently) but it is useful as is and a reasonable
> > starting point for further work.
> >
> > I would propose that we just fork my current repo into a new one in the
> > LLVM organization and then take the necessary steps to get
> > credentials/permissions/secrets set up in the new context.
> >
> > Some answers to questions that may come up:
> >
> > * *Why should this be a repo separate from llvm-project?* These kinds
> > of automation repos tend to have a lot of "garbage" commits that I
> > think are best kept from polluting the main repo (and they also
> > don't face contention on automatic jobs bumping things, etc). They
> > also tend to require special permissions and secrets that we will
> > want to control more tightly. They also make use of other GitHub
> > features that we would probably prefer not to pollute the main
> > development flow ("Releases" tab, Actions, etc). Also, this is the
> > kind of thing that tends to get revised en masse periodically, and
> > again, it would be good to not pollute the monorepo.
>
> There really aren't many files in this repo, do you anticipate it
> growing significantly?
>
>
> Not terribly so. Just from some personal experience, the way things are
> done for Python packaging is somewhat... esoteric... compared to a normal
> C++ build flow and necessitates certain directory layouts and such that I
> felt were better left to their own thing (it is something that you want
> to do exactly as everyone else does it).
>
>
> > * *Why not land this in llvm-zorg?* llvm-zorg claims to be for "LLVM
> > Testing Infrastructure" and seems well scoped to that statement.
> > What I am managing above is periodic, automated release tooling
> > based on open-source CI systems (currently GitHub Actions), which
> > are fairly standardized across the Python releasing community, easy
> > to set up, etc.
>
> llvm-zorg also handles generating the websites. My personal opinion is
> that it would be OK to try to do this in llvm-zorg, but you're probably
> better off asking Galina about that. I guess the downside of using
> llvm-zorg is you don't get the releases tab.
>
>
> That is a good reason to put it there. One of the actions that is not
> implemented yet is for generating API docs (which is done post
> build/install for the Python side, because it introspects a running system).
>
> The releases page is actually pretty important. For snapshot builds,
> python's pip can just scrape it directly for published, installable
> artifacts and without it, we would need to roll our own place to stash
> such things.
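
As a rough illustration of the scraping described above: pip's `--find-links` option accepts the URL of an HTML page and considers any archives linked from it as install candidates. The sketch below only builds the command line. The releases URL is an assumption derived from the prototype repo linked in this thread, and `mlir` is the example package name used earlier; whether a given Releases page exposes direct artifact links in its HTML is not guaranteed.

```python
import sys

# Assumed location of the snapshot artifacts: the Releases page of the
# prototype repo linked in this thread. This is not an official index.
RELEASES_URL = "https://github.com/stellaraccident/mlir-py-release/releases"


def pip_snapshot_command(package, find_links_url=RELEASES_URL):
    """Return the argv for installing `package` from a page of release links.

    --find-links tells pip to look for archives linked from the given page,
    and --pre allows pre-release (snapshot-style) versions to be selected.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--find-links", find_links_url,
        "--pre",
        package,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing gets installed.
    print(" ".join(pip_snapshot_command("mlir")))
```

Passing the returned list to `subprocess.run` would perform the actual install.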
>
Could you have the GitHub action directly submit the package to PyPI
rather than having pip scrape the release page? If we could, would there
be any reason to have a release page? Would users be downloading from
the release page or from PyPI?
>
> Why did you choose to write the checkout_repo.py script in Python rather
> than using the GitHub checkout action, or writing your own custom
> action?
>
>
> Good question - that was a limitation in my knowledge at the time (need
> to source the version from a file). Consider that a TODO to eliminate.
>
If you need anything more complicated than some of the builtin actions,
you can add them to the llvm/actions repo.
-Tom
>
> > * *What ultimately will the code in this repo do?*
> > o Have periodic GitHub actions to select new LLVM revisions and
> > schedule daily/snapshot releases.
>
> Do you have any idea how much of the GitHub Actions resources this would
> use? e.g. how many hours per week per operating system?
>
>
> Currently, each snapshot builds for about 30m on the free 2-core setups
> per OS. However, this isn't presently compiling as much of LLVM as
> will ultimately be needed. I have automation for another project where
> we do build more/most of the backends as well, and that builds for
> 1.25-1.5 hours per snapshot (and builds a fair bit more things unrelated
> to LLVM, so just an upper bound estimate). On my other project, I found
> that each minor Python version added (of which there are probably ~4
> LTS at any given time) added about 1min to each build.
>
> So if we are doing 2 snapshots a day and being conservative, 28
> hours/week/OS?
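
The arithmetic behind that estimate can be sketched as follows. The function name is illustrative, and the 2-hours-per-snapshot input is an inference about how the 28-hour figure was reached (rounding up from the 1.25-1.5 hour range); it is not stated in the thread.

```python
def weekly_ci_hours(snapshots_per_day, hours_per_snapshot, days_per_week=7):
    """Hours of CI time per week for a single operating system."""
    return snapshots_per_day * days_per_week * hours_per_snapshot


if __name__ == "__main__":
    # Upper end of the 1.25-1.5 hour range quoted for the other project.
    print(weekly_ci_hours(2, 1.5))  # 21.0
    # Assumed conservative rounding to 2 hours per snapshot.
    print(weekly_ci_hours(2, 2.0))  # 28.0
```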
>
> I'm not running tests yet, so that will come with some costs. We will
> probably choose to run just the python bindings tests per python version
> (which are really cheap) and then run the full regression suite once per OS.
>
>
> > o Have manual actions for triggering official, numbered releases.
> > o Provide facilities for building Python wheels for PyPI and house any
> > additional metadata/automation needed for Anaconda.
> > o Build releases for all supported operating systems (currently
> > Linux/CentOS7/manylinux2014, macOS, and Windows) and supported
> > Python versions (currently 3.6, 3.7, 3.8, 3.9).
> > o Publish release artifacts on the Releases tab for daily/snapshot
> > releases.
> > o Provide a stable reference point for downstream projects that
> > extend MLIR-Python and need to produce version-matched artifacts
> > of their own.
> > * *Could this graduate to be more than "MLIR" python?* Maybe. I chose
> > the name because that is what I am focused on and didn't want to
> > grab too much land. But there is nothing stopping this from becoming
> > automation for general LLVM monorepo+incubator Python releasing.
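
To make the size of the build matrix listed above concrete, here is a minimal sketch that enumerates one wheel per operating system and Python version. The values come from the list above; the string labels are illustrative and are not names used by the actual release scripts.

```python
from itertools import product

# Operating systems and Python versions as listed in the proposal.
# The label strings themselves are hypothetical identifiers.
OPERATING_SYSTEMS = ["linux-manylinux2014", "macos", "windows"]
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9"]


def build_matrix(systems=OPERATING_SYSTEMS, versions=PYTHON_VERSIONS):
    """Return one (os, python_version) pair per wheel to build."""
    return list(product(systems, versions))


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix))  # 12
    for os_name, py_version in matrix:
        print(f"{os_name} / Python {py_version}")
```

Each pair corresponds to one wheel per snapshot, which is why adding a Python version adds build time on every OS.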
>
> I think it would be great to generalize this. I would also like to
> automate parts of the main LLVM release, and there seems to be some
> overlap with what you are doing.
>
>
> Agreed. I actually found this quite easy to prototype. I think I spent a
> grand total of ~a day on what is there (which isn't done yet, but isn't
> super far off). It then took me ~3 days to adapt it to IREE
> (https://github.com/google/iree), which is much more complicated (as it
> has to build LLVM, a bunch of deps and TensorFlow).
>
>
> -Tom
>
> > * *What if we don't do this?*
> > o *Option A:* We keep running this in a private repo with the
> > disclaimer that is currently at the top: "Note that this is a
> > prototype of a real MLIR release process being run by a member
> > of the community. These are not official releases of the LLVM
> > Foundation in any way, and they are likely only going to be
> > useful to people actually working on LLVM/MLIR until we get
> > things productionized." We would miss opportunities for
> > convergence with other projects and would cause things to fragment.
> > o *Option B:* We only publish Python bindings in official LLVM
> > release packages, and only for the Python version they are built
> > with. We don't release Python binaries through normal package
> > management channels.
>
>
> >
> > Opinions?
> > - Stella
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>