[llvm-dev] [RFC] Move py-mlir-release to new top-level repo in the LLVM org

Tom Stellard via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 7 17:05:43 PST 2021


On 1/7/21 3:17 PM, Stella Laurenzo wrote:
> 
> 
> On Thu, Jan 7, 2021 at 2:40 PM Tom Stellard <tstellar at redhat.com 
> <mailto:tstellar at redhat.com>> wrote:
> 
>     On 1/7/21 10:55 AM, Stella Laurenzo via llvm-dev wrote:
>      > Hi folks, I would like to propose that we create a new top-level
>     repo in
>      > the LLVM organization for organizing the Python MLIR Releases (both
>      > daily and official numbered releases, whenever we are ready for
>     such a
>      > thing) and corresponding pushes to package repositories, etc.
>      >
> 
>     For those of us that are unfamiliar, can you explain what the "Python
>     MLIR Releases" are?
> 
> 
> Sure: They are the python wheels and source distributions for the [MLIR 
> Python Bindings](https://mlir.llvm.org/docs/Bindings/Python/). The key 
> is that we do them in concordance with how Python packages get released 
> and push them through standard channels for deployment, and this 
> involves some gymnastics (of which, what I have will grow in some 
> complexity as we do this, based on the experience of other projects). 
> They basically include everything such that if you do a "pip install 
> mlir" you get a working package that is able to build and compile MLIR 
> based IR in a variety of forms. An ancillary function of them is to 
> enable downstream Python based projects to extend the system, so it 
> entails distributing enough headers and libraries to make this feasible.
> 

OK, so it's this Python code: llvm-project/mlir/lib/Bindings/Python?

> 
>      > I have prototyped such a release process in a personal repo:
>      > https://github.com/stellaraccident/mlir-py-release
>      >
>      > Additional development on that release process is currently
>     blocked on
>      > more work on the shared library organization in LLVM (discussed here
>      >
>     https://lists.llvm.org/pipermail/llvm-dev/2021-January/147567.html and
>      > being worked on independently) but it is useful as is and a
>     reasonable
>      > starting point for further work.
>      >
>      > I would propose that we just fork my current repo into a new one
>     in the
>      > LLVM organization and then take the necessary steps to get
>      > credentials/permissions/secrets set up in the new context.
>      >
>      > Some answers to questions that may come up:
>      >
>      >   * *Why should this be a repo separate from llvm-project?* These
>      >     kinds of automation repos tend to have a lot of "garbage" commits
>      >     that I think are best kept out of the main repo (and they also
>      >     don't face contention on automatic jobs bumping things, etc). They
>      >     also tend to require special permissions and secrets that we will
>      >     want to control more tightly. They also make use of other GitHub
>      >     features that we would probably prefer not pollute the main
>      >     development flow ("Releases" tab, Actions, etc). Also, this is the
>      >     kind of thing that tends to get revised en masse periodically, and
>      >     again, it would be good to not pollute the monorepo.
> 
>     There really aren't many files in this repo, do you anticipate it
>     growing significantly?
> 
> 
> Not terribly so. Just from some personal experience, the way things are 
> done for Python packaging is somewhat... esoteric... compared to a normal 
> C++ build flow and necessitates certain directory layouts and such that I 
> felt were better left to their own thing (it is something that you want 
> to do exactly as everyone else does it).
> 
> 
>      >   * *Why not land this in llvm-zorg? *llvm-zorg claims to be for
>     "LLVM
>      >     Testing Infrastructure" and seems well scoped to that statement.
>      >     What I am managing above is periodic, automated release tooling
>      >     based on open-source CI systems (currently GitHub Actions), which
>      >     are fairly standardized across the Python releasing
>     community, easy
>      >     to set up, etc.
> 
>     llvm-zorg also handles generating the websites.  My personal opinion is
>     that it would be OK to try to do this in llvm-zorg, but you're probably
>     better off asking Galina about that.  I guess the downside of using
>     llvm-zorg is you don't get the releases tab.
> 
> 
> That is a good reason to put it there. One of the actions that is not 
> implemented yet is for generating API docs (which is done post 
> build/install for the Python side, because it introspects a running system).
> 
> The releases page is actually pretty important. For snapshot builds, 
> python's pip can just scrape it directly for published, installable 
> artifacts and without it, we would need to roll our own place to stash 
> such things.
> 
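For reference, the scraping described above matches pip's --find-links option, which takes a URL or directory and searches it for installable archives in addition to the package index. A minimal sketch in Python, with these assumptions: the snapshot wheels are attached to the Releases page of the prototype repo (the /releases URL is a guess), the package name "mlir" comes from the "pip install mlir" example earlier in the thread, and pip_install_command is a hypothetical helper. Whether the page's HTML actually exposes the asset links to pip would need to be verified.

```python
import sys

# Assumption: snapshot wheels are attached to this GitHub Releases page.
RELEASES_PAGE = "https://github.com/stellaraccident/mlir-py-release/releases"


def pip_install_command(package, find_links):
    """Build a pip command that also searches `find_links` for archives."""
    return [
        sys.executable, "-m", "pip", "install",
        "--find-links", find_links,
        package,
    ]


cmd = pip_install_command("mlir", RELEASES_PAGE)
print(" ".join(cmd))
# To actually install: subprocess.run(cmd, check=True)
```

The sketch only builds and prints the command, so it can be inspected without touching the network.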

Could you have the GitHub action submit the package directly to PyPI 
rather than having pip scrape the release page?  If we could, would there 
be any reason to have a release page?  Would users be downloading from 
the release page or from PyPI?
> 
>     Why did you choose to write the checkout_repo.py script in python
>     rather
>     than using the GitHub checkout action, or writing your own custom
>     action?
> 
> 
> Good question - that was a limitation in my knowledge at the time (need 
> to source the version from a file). Consider that a TODO to eliminate.
> 

If you need anything more complicated than the builtin actions, you can 
add custom actions to the llvm/actions repo.

-Tom
> 
>      >   * *What ultimately will the code in this repo do?*
>      >       o Have periodic GitHub actions to select new LLVM revisions and
>      >         schedule daily/snapshot releases.
> 
>     Do you have any idea how much of the GitHub Actions resources this
>     would use?  e.g. how many hours per week per operating system?
> 
> 
> Currently, each snapshot builds for about 30m on the free 2-core setups 
> per OS. However, this isn't presently compiling as much of LLVM as 
> will ultimately be needed. I have automation for another project where 
> we do build more/most of the backends as well, and that builds for 
> 1.25-1.5 hours per snapshot (and builds a fair bit more things unrelated 
> to LLVM, so just an upper bound estimate). On my other project, I found 
> that each minor python version added (of which, there are probably ~4 
> LTS at any given time) added about 1min to each build.
> 
> So if we are doing 2 snapshots a day and being conservative, 28 
> hours/week/OS?
> 
> I'm not running tests yet, so that will come with some costs. We will 
> probably choose to run just the python bindings tests per python version 
> (which are really cheap) and then run the full regression suite once per OS.
> 
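
As a rough check of the 28 hours/week/OS figure above: two snapshots a day is 14 builds a week, so 28 hours corresponds to budgeting 2 hours per snapshot. The sketch below redoes that arithmetic in Python with the numbers quoted in the thread. It assumes that the ~1 minute per Python version measured on the other project carries over here, and the constant and function names are made up for illustration.

```python
# Weekly GitHub Actions budget per OS, from the figures quoted above.
SNAPSHOTS_PER_DAY = 2
DAYS_PER_WEEK = 7
UPPER_BOUND_BUILD_HOURS = 1.5       # upper end of the 1.25-1.5 hour range
PYTHON_VERSIONS = 4                 # 3.6, 3.7, 3.8, 3.9
MINUTES_PER_PYTHON_VERSION = 1      # assumption: same cost as the other project


def weekly_hours_per_os(hours_per_snapshot):
    """Hours of runner time per week for one operating system."""
    return SNAPSHOTS_PER_DAY * DAYS_PER_WEEK * hours_per_snapshot


# Upper-bound build time plus the per-Python-version overhead.
measured = UPPER_BOUND_BUILD_HOURS + PYTHON_VERSIONS * MINUTES_PER_PYTHON_VERSION / 60

print(round(weekly_hours_per_os(measured), 1))  # 21.9
print(weekly_hours_per_os(2.0))                 # 28.0
```

So the conservative 28 hours leaves roughly 6 hours a week per OS of headroom over the measured upper bound, before any test time is added.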
> 
>      >       o Have manual actions for triggering official, numbered
>     releases.
>      >       o Facilities for building Python wheels for PyPI and housing any
>      >         additional metadata/automation needed for Anaconda.
>      >       o Build releases for all supported operating systems (currently
>      >         Linux/CentOS7/manylinux2014, macOS, and Windows) and supported
>      >         Python versions (currently 3.6, 3.7, 3.8, 3.9).
>      >       o Publish release artifacts on the Releases tab for
>     daily/snapshot
>      >         releases.
>      >       o Provide a stable reference point for downstream projects that
>      >         extend MLIR-Python and need to produce version-matched
>     artifacts
>      >         of their own.
>      >   * *Could this graduate to be more than "MLIR" python?* Maybe. I
>     chose
>      >     the name because that is what I am focused on and didn't want to
>      >     grab too much land. But there is nothing stopping this from
>     becoming
>      >     automation for general LLVM monorepo+incubator Python releasing.
> 
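
For a sense of scale, the operating systems and Python versions listed above multiply out to the number of wheel builds per snapshot. A small Python sketch of that matrix, assuming one wheel per (OS, Python version) pair; the labels and the build_matrix helper are illustrative and not taken from the prototype repo.

```python
from itertools import product

# Values as listed in the proposal above.
OPERATING_SYSTEMS = ["Linux (manylinux2014)", "macOS", "Windows"]
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9"]


def build_matrix(oses, pythons):
    """Return one build configuration per (OS, Python version) pair."""
    return [{"os": o, "python": p} for o, p in product(oses, pythons)]


matrix = build_matrix(OPERATING_SYSTEMS, PYTHON_VERSIONS)
print(len(matrix))  # 12
for config in matrix[:2]:
    print(config["os"], config["python"])
```

Each added OS or Python version grows the matrix multiplicatively, which is worth keeping in mind when estimating runner time.
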
>     I think it would be great to generalize this.  I would also like to
>     automate parts of the main LLVM release, and there seems to be some
>     overlap with what you are doing.
> 
> 
> Agreed. I actually found this quite easy to prototype. I think I spent a 
> grand total of ~a day on what is there (which isn't done yet, but isn't 
> super far off). It then took me ~3 days to adapt it to IREE 
> (https://github.com/google/iree), which is much more complicated (as it 
> has to build LLVM, a bunch of deps and TensorFlow).
> 
> 
>     -Tom
> 
>      >   * *What if we don't do this?*
>      >       o *Option A:* We keep running this in a private repo with the
>      >         disclaimer that is currently at the top: "Note that this is a
>      >         prototype of a real MLIR release process being run by a
>     member
>      >         of the community. These are not official releases of the LLVM
>      >         Foundation in any way, and they are likely only going to be
>      >         useful to people actually working on LLVM/MLIR until we get
>      >         things productionized." We would miss opportunities for
>      >         convergence with other projects and would cause things to
>     fragment.
>      >       o *Option B: *We only publish Python bindings in official LLVM
>      >         release packages, and only for the Python version they
>     are built
>      >         with. We don't release Python binaries through normal package
>      >         management channels.
> 
> 
>      >
>      > Opinions?
>      > - Stella
>      >
>      > _______________________________________________
>      > LLVM Developers mailing list
>      > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>      > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>      >
> 


