[llvm-dev] [RFC] Move py-mlir-release to new top-level repo in the LLVM org

Stella Laurenzo via llvm-dev llvm-dev at lists.llvm.org
Thu Feb 11 20:46:58 PST 2021


I forgot to include an update on this: a month ago, Tom and I discussed this on
Discord and agreed that it would be fine to implement this support in the
monorepo with GitHub Actions (rather than in a new repo).

On Thu, Jan 7, 2021 at 6:08 PM Stella Laurenzo <stellaraccident at gmail.com>
wrote:

>
>
> On Thu, Jan 7, 2021 at 5:05 PM Tom Stellard <tstellar at redhat.com> wrote:
>
>> On 1/7/21 3:17 PM, Stella Laurenzo wrote:
>> >
>> >
>> > On Thu, Jan 7, 2021 at 2:40 PM Tom Stellard <tstellar at redhat.com
>> > <mailto:tstellar at redhat.com>> wrote:
>> >
>> >     On 1/7/21 10:55 AM, Stella Laurenzo via llvm-dev wrote:
>> >      > Hi folks, I would like to propose that we create a new top-level
>> >     repo in
>> >      > the LLVM organization for organizing the Python MLIR Releases
>> (both
>> >      > daily and official numbered releases, whenever we are ready for
>> >     such a
>> >      > thing) and corresponding pushes to package repositories, etc.
>> >      >
>> >
>> >     For those of us who are unfamiliar, can you explain what the
>> "Python
>> >     MLIR Releases" are?
>> >
>> >
>> > Sure: They are the python wheels and source distributions for the [MLIR
>> > Python Bindings](https://mlir.llvm.org/docs/Bindings/Python/). The key
>> > is that we do them in concordance with how Python packages get released
>> > and push them through standard channels for deployment, and this
>> > involves some gymnastics (of which, what I have will grow in some
>> > complexity as we do this, based on the experience of other projects).
>> > They basically include everything such that if you do a "pip install
>> > mlir" you get a working package that is able to build and compile MLIR
>> > based IR in a variety of forms. An ancillary function of them is to
>> > enable downstream Python based projects to extend the system, so it
>> > entails distributing enough headers and libraries to make this feasible.
>> >
>>
>> Ok, so it's this python code: llvm-project/mlir/lib/Bindings/Python ?
>>
>> >
>> >      > I have prototyped such a release process in a personal repo:
>> >      > https://github.com/stellaraccident/mlir-py-release
>> >      >
>> >      > Additional development on that release process is currently
>> >     blocked on
>> >      > more work on the shared library organization in LLVM (discussed
>> here
>> >      >
>> >     https://lists.llvm.org/pipermail/llvm-dev/2021-January/147567.html
>> and
>> >      > being worked on independently) but it is useful as is and a
>> >     reasonable
>> >      > starting point for further work.
>> >      >
>> >      > I would propose that we just fork my current repo into a new one
>> >     in the
>> >      > LLVM organization and then take the necessary steps to get
>> >      > credentials/permissions/secrets set up in the new context.
>> >      >
>> >      > Some answers to questions that may come up:
>> >      >
>> >      >   * *Why should this be a repo separate from llvm-project?* These
>> >     kinds
>> >      >     of automation repos tend to have a lot of "garbage" commits
>> >     that I
>> >      >     think should not pollute the main repo (and also
>> >     don't
>> >      >     face contention on automatic jobs bumping things, etc). They
>> also
>> >      >     tend to require special permissions and secrets that we will
>> >     want to
>> >      >     more tightly control. They also make use of other GitHub
>> features
>> >      >     that we would probably prefer to keep out of the main
>> >     development
>> >      >     flow ("Releases" tab, Actions, etc). Also, this is the kind
>> >     of thing
>> >      >     that tends to get revised en masse periodically, and again,
>> >     it would
>> >      >     be good to not pollute the monorepo.
>> >
>> >     There really aren't many files in this repo, do you anticipate it
>> >     growing significantly?
>> >
>> >
>> > Not terribly so. Just from some personal experience, the ways things
>> are
>> > done for Python packaging are somewhat... esoteric... compared to a normal C++
>> > build flow and necessitate certain directory layouts and such that I
>> > felt were better left to their own thing (it is something that you want
>> > to do exactly as everyone else does it).
>> >
>> >
>> >      >   * *Why not land this in llvm-zorg? *llvm-zorg claims to be for
>> >     "LLVM
>> >      >     Testing Infrastructure" and seems well scoped to that
>> statement.
>> >      >     What I am managing above is periodic, automated release
>> tooling
>> >      >     based on open-source CI systems (currently GitHub Actions),
>> which
>> >      >     are fairly standardized across the Python releasing
>> >     community, easy
>> >      >     to set up, etc.
>> >
>> >     llvm-zorg also handles generating the websites.  My personal
>> opinion is
>> >     that it would be OK to try to do this in llvm-zorg, but you're
>> probably
>> >     better off asking Galina about that.  I guess the downside of using
>> >     llvm-zorg is you don't get the releases tab.
>> >
>> >
>> > That is a good reason to put it there. One of the actions that is not
>> > implemented yet is for generating API docs (which is done post
>> > build/install for the Python side, because it introspects a running
>> system).
>> >
>> > The releases page is actually pretty important. For snapshot builds,
>> > python's pip can just scrape it directly for published, installable
>> > artifacts and without it, we would need to roll our own place to stash
>> > such things.
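A minimal sketch of the "pip scrapes the releases page" idea described above, using pip's `--find-links` option, which parses an HTML page for links to installable archives. The package name `mlir` comes from the thread; the `/releases` URL and the helper name `pip_install_from_release_page` are assumptions made for illustration, and whether a given hosting page exposes its links in a form pip can parse is not guaranteed.

```python
import sys


def pip_install_from_release_page(package, release_page_url):
    """Build an argv list asking pip to also search a release page.

    Hypothetical helper: it only constructs the command and does not run it.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--find-links", release_page_url,  # page that links to wheels/sdists
        package,
    ]


# Assumed URL: the prototype repo from the thread plus "/releases".
cmd = pip_install_from_release_page(
    "mlir", "https://github.com/stellaraccident/mlir-py-release/releases"
)
print(" ".join(cmd))
```

The argv list could be handed to `subprocess.run(cmd, check=True)` to perform the install. Keeping it as a list avoids shell quoting issues.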
>> >
>>
>> Could you have the GitHub action directly submit the package to pip
>> rather than having it scrape the release page?  If we could, would there
>> be any reason to have a release page?  Would users be downloading from
>> the release page or from pip?
>>
>
> My team's preference, while we are as pre-release as we are, is to not
> pollute the pip namespace until we're sure we have what we want. Deploying
> to the local project's release page is a good way to have some people be
> able to use it earlier but also still have an appropriate barrier to entry
> that matches where it's at in the life-cycle.
>
> Personal preference.
>
> Some projects end up always deploying from their release page because they
> can't comply with PyPI policies (usually around distro version,
> dependencies, etc), but I've charted this out and think we will stay
> compliant.
>
>
>> >
>> >     Why did you choose to write the checkout_repo.py script in python
>> >     rather
>> >     than using the GitHub checkout action, or writing your own custom
>> >     action?
>> >
>> >
>> > Good question - that was a limitation in my knowledge at the time (need
>> > to source the version from a file). Consider that a TODO to eliminate.
>> >
>>
>> If you need anything more complicated than some of the builtin actions,
>> you can add them to the llvm/actions repo.
>>
>
> Nice, thanks.
>
>
>>
>> -Tom
>> >
>> >      >   * *What ultimately will the code in this repo do?*
>> >      >       o Have periodic GitHub actions to select new LLVM
>> revisions and
>> >      >         schedule daily/snapshot releases.
>> >
>> >     Do you have any idea how much of the GitHub Actions resources this
>> would
>> >     use?  e.g. how many hours per week per Operating System?
>> >
>> >
>> > Currently, each snapshot builds for about 30m on the free 2-core setups
>> > per OS. However, this isn't presently compiling as much of LLVM as
>> > will ultimately be needed. I have automation for another project where
>> > we do build more/most of the backends as well, and that builds for
>> > 1.25-1.5 hours per snapshot (and builds a fair bit more things
>> unrelated
>> > to LLVM, so just an upper bound estimate). On my other project, I found
>> > that each minor python version added (of which, there are probably ~4
>> > LTS at any given time) added about 1min to each build.
>> >
>> > So if we are doing 2 snapshots a day and being conservative, 28
>> > hours/week/OS?
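A rough worked version of the estimate above, as a sketch. The thread gives 2 snapshots a day, 1.25-1.5 hours per snapshot on the larger project, and about 1 minute per added Python version. The 2-hours-per-snapshot figure below is an assumption chosen because it reproduces the quoted 28 hours/week/OS; the function and parameter names are hypothetical.

```python
def weekly_build_hours(snapshots_per_day, base_hours_per_snapshot,
                       python_versions=0, minutes_per_python_version=0.0,
                       days_per_week=7):
    """Estimate CI hours per week for one operating system."""
    per_snapshot = (base_hours_per_snapshot
                    + python_versions * minutes_per_python_version / 60.0)
    return snapshots_per_day * days_per_week * per_snapshot


# Assumed conservative budget: 2 hours per snapshot, 2 snapshots per day.
conservative = weekly_build_hours(2, 2.0)
print(conservative)  # 28.0

# Upper figure from the thread: 1.5 hours plus ~1 minute for each of 4 Pythons.
measured = weekly_build_hours(2, 1.5, python_versions=4,
                              minutes_per_python_version=1.0)
print(round(measured, 2))  # 21.93
```

Under these assumptions the measured figures come to about 22 hours per week per OS, so 28 leaves some headroom for tests and a larger LLVM build.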
>> >
>> > I'm not running tests yet, so that will come with some costs. We will
>> > probably choose to run just the python bindings tests per python
>> version
>> > (which are really cheap) and then run the full regression suite once
>> per OS.
>> >
>> >
>> >      >       o Have manual actions for triggering official, numbered
>> >     releases.
>> >      >       o Build Python wheels for PyPI and house
>> any
>> >      >         additional metadata/automation needed for Anaconda.
>> >      >       o Build releases for all supported operating systems
>> >     (currently
>> >      >         Linux/CentOS7/manylinux2014, macOS, and Windows) and
>> >     supported
>> >      >         Python versions (currently 3.6, 3.7, 3.8, 3.9).
>> >      >       o Publish release artifacts on the Releases tab for
>> >     daily/snapshot
>> >      >         releases.
>> >      >       o Provide a stable reference point for downstream projects
>> that
>> >      >         extend MLIR-Python and need to produce version-matched
>> >     artifacts
>> >      >         of their own.
>> >      >   * *Could this graduate to be more than "MLIR" python?* Maybe. I
>> >     chose
>> >      >     the name because that is what I am focused on and didn't
>> want to
>> >      >     grab too much land. But there is nothing stopping this from
>> >     becoming
>> >      >     automation for general LLVM monorepo+incubator Python
>> releasing.
>> >
>> >     I think it would be great to generalize this.  I would also like to
>> >     automate parts of the main LLVM release, and there seems to be some
>> >     overlap
>> >     with what you are doing.
>> >
>> >
>> > Agreed. I actually found this quite easy to prototype. I think I spent
>> a
>> > grand total of ~a day on what is there (which isn't done yet, but isn't
>> > super far off). It then took me ~3 days to adapt it to IREE
>> > (https://github.com/google/iree), which is much more complicated (as
>> it
>> > has to build LLVM, a bunch of deps and TensorFlow).
>> >
>> >
>> >     -Tom
>> >
>> >      >   * *What if we don't do this?*
>> >      >       o *Option A:* We keep running this in a private repo with
>> the
>> >      >         disclaimer that is currently at the top: "Note that this
>> is a
>> >      >         prototype of a real MLIR release process being run by a
>> >     member
>> >      >         of the community. These are not official releases of the
>> LLVM
>> >      >         Foundation in any way, and they are likely only going to
>> be
>> >      >         useful to people actually working on LLVM/MLIR until we
>> get
>> >      >         things productionized." We would miss opportunities for
>> >      >         convergence with other projects and would cause things to
>> >     fragment.
>> >      >       o *Option B:* We only publish Python bindings in official
>> LLVM
>> >      >         release packages, and only for the Python version they
>> >     are built
>> >      >         with. We don't release Python binaries through normal
>> package
>> >      >         management channels.
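To make the scope of the proposed release matrix concrete (the operating systems and Python versions listed in the proposal above), here is a small sketch that enumerates the build combinations. The platform labels and the `release_matrix` helper are illustrative assumptions, not names used by the prototype repo.

```python
from itertools import product

# Values taken from the proposal; the string labels themselves are assumed.
OPERATING_SYSTEMS = ["manylinux2014", "macos", "windows"]
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9"]


def release_matrix(operating_systems, python_versions):
    """Return one build configuration per (OS, Python version) pair."""
    return [
        {"os": os_name, "python": py}
        for os_name, py in product(operating_systems, python_versions)
    ]


matrix = release_matrix(OPERATING_SYSTEMS, PYTHON_VERSIONS)
print(len(matrix))  # 12
```

Each entry corresponds to one wheel that a snapshot or numbered release would need to produce.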
>> >
>> >
>> >      >
>> >      > Opinions?
>> >      > - Stella
>> >      >
>> >      > _______________________________________________
>> >      > LLVM Developers mailing list
>> >      > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> >      > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >      >
>> >
>>
>>