[PATCH] D84233: [lit] Escape ANSI control character in xunit output
Alexander Richardson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 24 01:32:29 PDT 2020
arichardson added a comment.
In D84233#2170305 <https://reviews.llvm.org/D84233#2170305>, @yln wrote:
> @arichardson: can you double-check that this workaround is still needed?
> Do we understand the semantics of CDATA blocks? I was under the impression we use it here to avoid problems like this.
>
> Anyways, I am fine with this. Adding Joel as a second reviewer to get his feedback before accepting.
I believe CDATA just avoids the need to escape XML special characters. However, characters below 0x20 (with the exception of \t, \r and \n) are not valid anywhere in the document according to the XML spec (https://www.w3.org/TR/xml/#charsets):
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
XML 1.1 seems to relax that and allow everything except NUL: https://www.w3.org/TR/xml11/#charsets
Maybe specifying version 1.1 for the XUnit output would make the Java parsers happy again, but escaping ANSI control characters might also be useful if you open the report XML file in a text editor.
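To make the XML 1.0 restriction concrete, here is a minimal sketch of the escaping idea in Python. It is not the code from this patch: the helper name `escape_invalid_xml_chars` and the `[U+XXXX]` marker format are made up for illustration, and the character class is simply the complement of the Char production quoted above.

```python
import re
import xml.etree.ElementTree as ET

# Complement of the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML10 = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def escape_invalid_xml_chars(text):
    """Replace characters XML 1.0 forbids with a visible [U+XXXX] marker.

    Hypothetical helper for illustration; the marker format is an assumption.
    """
    return _INVALID_XML10.sub(lambda m: "[U+%04X]" % ord(m.group()), text)


# Test output containing ANSI color escapes (ESC is 0x1B, which XML 1.0 forbids).
colored = "\x1b[31mFAIL\x1b[0m\n"
escaped = escape_invalid_xml_chars(colored)

# The escaped text can now sit inside a CDATA block and still parse.
element = ET.fromstring("<system-out><![CDATA[" + escaped + "]]></system-out>")
parsed_text = element.text
```

Replacing with a readable marker rather than dropping the character keeps the report legible in a text editor while staying within XML 1.0.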
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84233/new/
https://reviews.llvm.org/D84233