[cfe-dev] Bugzilla migration is stopped again

Arthur O'Dwyer via cfe-dev cfe-dev at lists.llvm.org
Tue Dec 7 07:54:15 PST 2021


On Tue, Dec 7, 2021 at 7:33 AM Anton Korobeynikov via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> > 6) We can edit the comments by hand (can you only edit your own comments
> > or can we edit someone else's comments, I'm thinking it's only our own based
> > on testing I've done with other repos)
> > - isn't this a requirement in order to fix up the "code-blocks"?
>
> Yes, only admins can edit everything.

I noticed this yesterday with the existing test migration: compare
https://bugs.llvm.org/show_bug.cgi?id=52598
versus
https://github.com/llvm/llvm-bugzilla-archive/issues/52598

The current script seems to be forgetting that GitHub issues use Markdown,
and so every existing Bugzilla comment needs to be wrapped in
triple-backticks to preserve its semantics.
(You could do *cleverer* things, like "don't wrap comments that are only
one line long," but doing anything *less-clever* will be a non-starter.)
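
A minimal sketch of that wrapping rule in Python, as an illustration only (this is not the actual migration script, and `wrap_comment` is a hypothetical name). It assumes the comment arrives as plain text, skips the fence for one-line comments, and lengthens the fence when the comment itself contains a run of backticks:

```python
import re


def wrap_comment(text: str) -> str:
    """Wrap a plain-text Bugzilla comment so Markdown renders it verbatim."""
    text = text.rstrip("\n")
    # The "cleverer" variant: leave single-line comments unwrapped.
    if "\n" not in text:
        return text
    # The fence must be longer than any backtick run inside the comment,
    # otherwise a line of backticks in the comment would close it early.
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * max(3, longest + 1)
    return f"{fence}\n{text}\n{fence}"
```

A one-line comment can still contain characters that Markdown interprets, so the single-line exception trades some fidelity for readability.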

> > Assuming there is no obvious/immediate fix, do we have any choice but to
> > move ahead with the existing import and fix the comments by hand
> > retrospectively (assuming 6)?
>
> This is what I asked GitHub engineers. They essentially asked for yet
> another day to figure out the possible options. My rough estimate is that
> at least 5k issues will have broken links.


Anton: I see about 35,000 issues in
https://github.com/llvm/llvm-bugzilla-archive/issues
but only 228 (i.e. essentially none, presumably just historical noise from
newbie GitHub users) in
https://github.com/llvm/llvm-project/issues
Where are the 13,000 issues you are saying have already been migrated?

IIUC, it's *very fortunate* that there aren't yet 13,000 issues in
https://github.com/llvm/llvm-project/issues .  That means that it is still
an option to do a "practice" migration into a test repo, e.g.
https://github.com/llvm/llvm-bugzilla-archive2 (and then if it works as
intended, you can either "blow away
https://github.com/llvm/llvm-bugzilla-archive and rename
https://github.com/llvm/llvm-bugzilla-archive2 to
https://github.com/llvm/llvm-bugzilla-archive", or "blow away
https://github.com/llvm/llvm-bugzilla-archive and repeat the migration just
to prove it works *reproducibly*").
Only once the whole migration has been tested end-to-end on a test repo,
would I recommend starting the migration into the production repo
https://github.com/llvm/llvm-project.

Thanks for the links to https://github.com/llvm/bugzilla2gitlab/tree/llvm
and
https://docs.google.com/document/d/1G6DZ6AxzSaOlrtTxoxtqYKnD4Myv40QfKK4wj54y8ms/edit .
Those make it clear that someone's done a little bit of work to script this
stuff; but the Google Doc also makes it clear that there is a long way to
go to accomplish a "deploy plan": someone needs to take that English
description and turn it into code (Python or even Bash or whatever) that
can be
(A) reviewed for correctness, without running it
(B) run multiple times with guaranteed same behavior, with no risk that
some human will accidentally forget a step in the middle
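
To make (A) and (B) concrete, here is a rough Python sketch of what a scripted plan could look like. Everything in it is hypothetical: the step names, the placeholder step bodies, and `run_plan` itself are illustrations, not taken from the Google Doc or from any existing script. The point is only that the steps live in one ordered list, so a reviewer can read the plan without running it and nobody can skip a step by accident:

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], None]]


def run_plan(steps: List[Step], state: Dict, dry_run: bool = True) -> List[str]:
    """Run the named steps in order; in dry-run mode only list them."""
    log = []
    for number, (name, action) in enumerate(steps, start=1):
        log.append(f"{number}. {name}")
        if not dry_run:
            action(state)
    return log


# Placeholder steps (hypothetical): a real plan would fetch and convert data.
def fetch_xml(state: Dict) -> None:
    state["xml"] = "<bugzilla/>"


def convert_to_json(state: Dict) -> None:
    state["json"] = {"source": state["xml"]}


PLAN: List[Step] = [
    ("fetch XML from Bugzilla", fetch_xml),
    ("convert XML to JSON", convert_to_json),
]

if __name__ == "__main__":
    # Dry run: print the plan for review without executing anything.
    print("\n".join(run_plan(PLAN, {}, dry_run=True)))
```

Calling `run_plan(PLAN, state, dry_run=False)` executes the same list in the same order every time, which is the property (B) asks for.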

Step 1, getting the XML files from Bugzilla, turns out to be super easy
because there's a public API for that:
https://github.com/Quuxplusone/BugzillaToGithub
Step 3, transforming XML to GitHub's JSON schema, requires knowing what
GitHub's schema looks like. I've found
https://gist.github.com/jonmagic/5282384165e0f86ef105#start-an-issue-import
although it's not real clear what the schema is or if that even still works
(I haven't tried yet). Also, there seems to be no way for one GitHub user
to create a comment or issue putatively authored by some other GitHub user.
(Which certainly makes sense.) So this would result in issues and comments
filed by "LLVM Import Bot" or whatever... but I think that's fine, and
might even avoid some issues that you'd have otherwise, with scenarios like
"Joe User created his GitHub account in 2015, but was making comments on
LLVM issues back in 2012."
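
A small Python sketch of what Step 3 might look like, under stated assumptions. The element names (`bug_id`, `short_desc`, `long_desc`, `who`, `bug_when`, `thetext`) are what Bugzilla's XML export uses as far as I know; the output shape (`issue` plus `comments`) follows my reading of the gist above and is unverified; the sample bug and the function names are made up. Since everything would be posted by the import bot, the original author's name and timestamp are kept in the body text:

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of a Bugzilla XML export (assumed element names).
SAMPLE_XML = """<bugzilla>
  <bug>
    <bug_id>12345</bug_id>
    <short_desc>Example crash</short_desc>
    <long_desc>
      <who name="Jane Doe">jane@example.com</who>
      <bug_when>2012-03-04 05:06:07 +0000</bug_when>
      <thetext>First comment.</thetext>
    </long_desc>
    <long_desc>
      <who name="Joe User">joe@example.com</who>
      <bug_when>2012-03-05 08:09:10 +0000</bug_when>
      <thetext>Second comment.</thetext>
    </long_desc>
  </bug>
</bugzilla>"""


def render(desc: ET.Element) -> str:
    """Render one Bugzilla comment, keeping the author's human name."""
    who = desc.find("who")
    name = (who.get("name") if who is not None else None) or "unknown author"
    when = desc.findtext("bug_when", default="unknown date")
    text = desc.findtext("thetext", default="")
    return f"{name} wrote on {when}:\n\n{text}"


def bug_to_issue(bug: ET.Element) -> dict:
    """Convert one <bug> element to an import payload (assumed schema)."""
    descs = bug.findall("long_desc")
    bug_id = bug.findtext("bug_id", default="?")
    title = bug.findtext("short_desc", default="")
    return {
        "issue": {
            "title": f"[Bug {bug_id}] {title}",
            # Bugzilla stores the bug description as the first comment.
            "body": render(descs[0]) if descs else "",
        },
        "comments": [{"body": render(d)} for d in descs[1:]],
    }


if __name__ == "__main__":
    import json

    bug = ET.fromstring(SAMPLE_XML).find("bug")
    print(json.dumps(bug_to_issue(bug), indent=2))
```

The bodies are left as raw text here; a real converter would also need to decide how to escape them for Markdown.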

Vice versa, btw, you've currently got some issues being incorrectly
imported with the reporter listed *in the issue summary itself* as "LLVM
Bugzilla Contributor"; e.g. this one from Chris Burel:
https://github.com/llvm/llvm-bugzilla-archive/issues/52567
It certainly makes sense that you won't have a GitHub *username* for some
people, but you still shouldn't throw away the information about their
human name just because we're migrating from one platform to another.

–Arthur