[PATCH] D133476: [automation] Add scripts to automate GitHub projects

Konrad Wilhelm Kleine via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 12 01:52:18 PDT 2022


kwk requested changes to this revision.
kwk added a subscriber: serge-sans-paille.
kwk added a comment.
This revision now requires changes to proceed.

@thieta as much as I like the effort you've made here. I'd like to propose a standalone GraphQL library. You're adding lots of code that is hard to trust without tests. That is why I started to create a library with tests and a release process to pypi. I think much of the things you do here are so generic that we can integrate them into the library at will.

https://github.com/kwk/ghgql

So far I will try to see if we can add the queries you've written nicely into "my" library.



================
Comment at: llvm/utils/git/github-project.py:14-36
+query($org: String!, $number: Int!) {
+  organization(login: $org) {
+    projectV2(number: $number) {
+      id
+      fields(first:20) {
+        nodes {
+          ... on ProjectV2Field {
----------------
I would love to see those queries in their own files. Then we can call them manually and verify them using tools etc. But I see that this query itself comes from here: https://docs.github.com/en/issues/planning-and-tracking-with-projects/automating-your-project/automating-projects-using-actions#example-workflow-authenticating-with-a-github-app, right?


================
Comment at: llvm/utils/git/github-project.py:78
+ADD_ISSUE_TO_PROJECT = """
+mutation AddToProject($project: ID!, $issue: ID!) {
+  addProjectV2ItemById(input: { projectId: $project, contentId: $issue }) {
----------------



================
Comment at: llvm/utils/git/github-project.py:154
+
+session = requests.session()
+
----------------
This function is deprecated but still works and requires minimal changes. I'm doing `from requests import Session` and then `self.__session = Session()`. For example. BUT: I found out when running my tests, that the session object is never closed. That's why I've implemented my `GithubGraphQL` class as a context manger so that you can write: 

```lang=python
with ghgql.GithubGraphQL() as ghapi:
    ghapi.query_from_file("foo.gql")
```

and the session stored in `ghapi` will be closed.


================
Comment at: llvm/utils/git/github-project.py:157-159
+def graphql(query: str, variables: Dict[str, Union[int, str]]) -> Dict[Any, Any]:
+  req = session.post("https://api.github.com/graphql", json={"query": query, "variables": variables})
+  req.raise_for_status()
----------------
@thieta please take a look at the function `run_graphql_query` and the comments for it here: https://reviews.llvm.org/D119412. We've had some discussions on GraphQL already. And I believe we agreed to outsource the queries in the end. At least we discussed to have them clean and use the `variables` array which you've type here quite nicely.

Anyway I would love to see a sharing of code when it comes to GraphQL. Especially when it comes to making proper calls using an API in Python.

You're potentially raising an `HttpError` when calling `raise_for_status`. Is that on purpose or do you want to handle errors locally here? 

When you run the `PROJECT_FIELDS` query in the GraphQL explorer (https://docs.github.com/en/graphql/overview/explorer) you're being greeted with this error:

```
Although you appear to have the correct authorization credentials, the `llvm` organization has enabled OAuth App access restrictions, meaning that data access to third-parties is limited. For more information on these restrictions, including how to enable this app, visit https://docs.github.com/articles/restricting-access-to-your-organization-s-data/
```

That is why I believe you should specify a token when making a request to the GraphQL API.

EDIT: I saw that you're setting the bearer `token` to the global `session` object directly when parsing the arguments. In the spirit of sharing code for GraphQL usage, can we please just make a class with some nice defaults? Here's an idea for such a class. Then we can get rid of global objects and alike: https://github.com/kwk/ghgql.

I think this incorporates all of the stuff from @tstellar and myself as well as your code and additional tests. I've decided to always return a `ghgql.Result` even in case of errors. See the `tests/` folder for usage information. This way we don't have to mess with exceptions so much and you can call the method on the returned object. All that `ghgql.Result` does is to wrap `dict`. What do you think? Can we get this in as a separate patch maybe? I mean not the python package but maybe it makes sense to outsource it like I did and only keep the code in the llvm tree. @serge-sans-paille you do a lot for `python-lit` if I'm not mistaken. Maybe you can tell us more about the pros and cons.


================
Comment at: llvm/utils/git/github-project.py:162-170
+  if "errors" in jsondata:
+    print("Something went wrong:")
+    pprint(jsondata["errors"])
+    raise RuntimeError
+
+  if not "data" in jsondata:
+    print("No data in return - instead we got:")
----------------
I've designed the `ghgql.Result` class to only raise a `RuntimeError` if a user tries to access and element from a result but there are errors present (see https://github.com/kwk/ghgql/blob/main/src/ghgql/result.py#L36).


================
Comment at: llvm/utils/git/github-project.py:174-188
+# Helper function to get data from a deeply nested dict
+# gv(dict, "status.item.id") -> dict["status"]["item"]["id"]
+# plus that it handles if something is not there.
+def gv(data_dict: Dict, value_key: str, def_value:Any = None) -> Any:
+  keys = value_key.split(".")
+
+  cd = data_dict
----------------
I like this idea. You can find this idea in my `ghgql.Result` class here: https://github.com/kwk/ghgql/blob/main/src/ghgql/result.py#L18


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133476/new/

https://reviews.llvm.org/D133476



More information about the llvm-commits mailing list