[flang-dev] [cfe-dev] Bugzilla migration is stopped again

Anton Korobeynikov via flang-dev flang-dev at lists.llvm.org
Tue Dec 7 08:03:24 PST 2021


> I noticed this yesterday with the existing test migration: compare
> https://bugs.llvm.org/show_bug.cgi?id=52598
> versus
> https://github.com/llvm/llvm-bugzilla-archive/issues/52598
>
> The current script seems to be forgetting that GitHub issues use Markdown, and so every existing Bugzilla comment needs to be wrapped in triple-backticks to preserve its semantics.
No, it is not. This was discussed at one of the roundtables, and it
was decided that the conversion would be done verbatim. If necessary,
individual issues can be converted to proper Markdown by their reporters.
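
[Editor's illustration, not part of the migration scripts.] For readers wondering what the quoted suggestion would involve, here is a minimal Python sketch of wrapping a Bugzilla comment in a Markdown code fence. The function name `fence_verbatim` is hypothetical. The one subtlety it shows is that the fence must be longer than any run of backticks already present in the comment, otherwise the block would close early.

```python
import re


def fence_verbatim(text: str) -> str:
    """Wrap text in a Markdown code fence so it renders verbatim.

    Hypothetical helper for illustration only. The fence is made one
    backtick longer than the longest backtick run inside the text
    (and never shorter than three), so the text cannot close it.
    """
    longest_run = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * max(3, longest_run + 1)
    return f"{fence}\n{text}\n{fence}"
```

Whether to fence comments at all is a policy choice; as the reply above says, the roundtable decision was to import the text verbatim instead.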

> Anton: I see about 35,000 issues in
> https://github.com/llvm/llvm-bugzilla-archive/issues
> but only 228 (i.e. essentially none, presumably just historical noise from newbie GitHub users) in
> https://github.com/llvm/llvm-project/issues
> Where are the 13,000 issues you are saying have already been migrated?
You cannot see them because issues are currently disabled in the
llvm-project repo, to keep things intact while we wait for suggestions
from GitHub engineers. What you're seeing are pull requests (note the
header).

> IIUC, it's very fortunate that there aren't yet 13,000 issues in https://github.com/llvm/llvm-project/issues
They are there; see above.

> Only once the whole migration has been tested end-to-end on a test repo, would I recommend starting the migration into the production repo https://github.com/llvm/llvm-project.

> Those make it clear that someone's done a little bit of work to script this stuff; but the Google Doc also makes it clear that there is a long way to go to accomplish a "deploy plan": someone needs to take that English description and turn it into code (Python or even Bash or whatever) that
Do you want me to write a Bash script for work that is being done by GitHub engineers?

> Step 1, getting the XML files from Bugzilla, turns out to be super easy because there's a public API for that:
> https://github.com/Quuxplusone/BugzillaToGithub
> Step 3, transforming XML to GitHub's JSON schema, requires knowing what GitHub's schema looks like. I've found
> https://gist.github.com/jonmagic/5282384165e0f86ef105#start-an-issue-import
> although it's not real clear what the schema is or if that even still works (I haven't tried yet). Also, there seems to be no way for one GitHub user to create a comment or issue putatively authored by some other GitHub user. (Which certainly makes sense.)
Well, the approach we are currently using handles this well. That
said, I would like to see the 10k migrated issues at
https://github.com/Quuxplusone/ at the end of the week, as you promised,
and compare them with what we already have in llvm-bugzilla-archive.

> So this would result in issues and comments filed by "LLVM Import Bot" or whatever... but I think that's fine, and might even avoid some issues that you'd have otherwise, with scenarios like "Joe User created his GitHub account in 2015, but was making comments on LLVM issues back in 2012."
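
[Editor's illustration, not the scripts used by either party.] The XML-to-JSON step discussed in the quoted text could look roughly like the Python sketch below. Everything about the shapes is an assumption: the input element names (`bug`, `short_desc`, `long_desc`, `who`, `bug_when`, `thetext`) are assumed to match Bugzilla's XML export, and the output keys (`issue`, `comments`, `title`, `body`, `created_at`) are assumed from the linked gist, which the quoted author notes is unclear and possibly outdated. Because the importing account cannot post as another user, the sketch records the original commenter's name inside the body text. The actual migration anonymises contributors who did not fill in the survey, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def to_utc_iso(bug_when: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS +ZZZZ' (assumed format) to ISO 8601 UTC."""
    parsed = datetime.strptime(bug_when, "%Y-%m-%d %H:%M:%S %z")
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def render_comment(long_desc: ET.Element) -> dict:
    """Turn one <long_desc> element into a comment-like dict."""
    who = long_desc.find("who")
    # Prefer the display name attribute, fall back to the element text.
    name = who.get("name") or who.text
    return {
        "created_at": to_utc_iso(long_desc.findtext("bug_when")),
        "body": f"Originally posted by {name}:\n\n{long_desc.findtext('thetext')}",
    }


def bug_xml_to_import_payload(xml_text: str) -> dict:
    """Build a hypothetical import payload from one exported bug."""
    bug = ET.fromstring(xml_text).find("bug")
    rendered = [render_comment(c) for c in bug.findall("long_desc")]
    first, rest = rendered[0], rendered[1:]  # first comment is the report
    return {
        "issue": {
            "title": bug.findtext("short_desc"),
            "body": first["body"],
            "created_at": first["created_at"],
        },
        "comments": rest,
    }
```

Keeping the original timestamps in `created_at` and the author in the body is one way to handle the "account created in 2015, comments from 2012" scenario without attributing anything to a GitHub user who did not write it.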

> Vice versa, btw, you've currently got some issues being incorrectly imported with the reporter listed in the issue summary itself as "LLVM Bugzilla Contributor"; e.g. this one from Chris Burel.
Chris Burel did not fill in the survey, therefore the data is anonymised.

-- 
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University

