[llvm-dev] Encoding/formating problems in migrated GitHub issues

Anton Korobeynikov via llvm-dev llvm-dev at lists.llvm.org
Sun Nov 28 12:33:30 PST 2021


Hello,

Thank you for your comments.

First of all, our Bugzilla contains 17k+ open issues. We will
certainly make them writable after the conversion.

> https://github.com/llvm/llvm-bugzilla-archive/issues/37813
> In the title some HTML encoded character was introduced changing a
> simple "@private" from the original report to "@​private".
As you can imagine, LLVM IR contains lots of things like "@foo", "#1"
and "!123". All of these create references to other GitHub objects –
issues or (even worse) GitHub users. There is no way we can "escape"
such sequences. The only viable solution is to insert a zero-width
space between, e.g., "#" and the number. However, it turns out that
the GitHub markdown engine renders such things differently – it
ignores such symbols in, e.g., titles (but will happily notify the
GitHub user @private in such a case).
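
To illustrate the zero-width space approach, here is a minimal sketch
in Python. It is not the actual migration code: the function name, the
regular expression and the exact set of trigger characters are
assumptions made up for this example.

```python
import re

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Assumed trigger rule (hypothetical, for illustration only):
#   "@" followed by a letter or digit -> could become a user mention
#   "#" or "!" followed by a digit    -> could become an object reference
_TRIGGER = re.compile(r"(@(?=[A-Za-z0-9])|[#!](?=\d))")


def defuse_references(text: str) -> str:
    """Insert a zero-width space right after each trigger character."""
    return _TRIGGER.sub(lambda m: m.group(1) + ZWSP, text)


if __name__ == "__main__":
    line = "call void @foo() #1, !dbg !123"
    # repr() makes the otherwise invisible \u200b characters visible.
    print(repr(defuse_references(line)))
```

The rewritten text looks the same to a reader, but "@" is no longer
directly followed by a name. Note that such a simple rule also
rewrites something like "user@example.com", which is not a mention at
all.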

We tried to limit the amount of sanitizing rewrites we do. Still,
there will be false positives and false negatives.

> https://github.com/llvm/llvm-bugzilla-archive/issues/47167
> Since GitHub uses Markdown by default the "__VA_ARGS__" in the sample
> code is now bold and the code is no longer valid.
This is expected, yes, as GitHub is all markdown and there is no way
to disable the markup. Copying everything into a code block is not an
option either, as it does not scroll and word-wrap properly, etc.

> https://github.com/llvm/llvm-bugzilla-archive/issues/48284
> The sample code in this is all messed up since it is treated as
> Markdown. This seems to be true for several of my reported issues.
See above.

-- 
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University

