[PATCH] D55746: [libcxx] [test] [re.traits] Correct expected values for invalid UTF-8
Michał Górny via Phabricator
reviews at reviews.llvm.org
Mon Dec 17 07:33:26 PST 2018
mgorny updated this revision to Diff 178463.
mgorny retitled this revision from "[libcxx] [test] [re.traits] Remove asserts failing due to invalid UTF-8" to "[libcxx] [test] [re.traits] Correct expected values for invalid UTF-8".
mgorny edited the summary of this revision.
mgorny added a comment.
Very well. I originally wanted to avoid relying on any specific behavior for invalid input, but I suppose you're right. Unless I'm misunderstanding the spec, the behavior should be equivalent to `tolower()`, and `tolower()` specifies that the value is returned unmodified if it has no lowercase representation. I suppose that depends on how you define "no lowercase representation", but I think it's reasonable to assume that invalid characters have none.
I've tested this behavior on Linux, FreeBSD and NetBSD. I suppose the original change, from behavior equivalent to what this patch expects to the broken behavior we have right now, was accidental.
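As a rough illustration of the "returned unmodified" rule, here is a minimal sketch built on the locale-aware `std::tolower` from `<locale>`. The helper names `lower_in` and `unchanged_by_tolower` are hypothetical and not part of the patch. The classic "C" locale is used because it is always available; results under `en_US.UTF-8`, which the test itself uses, depend on the platform's locale data.

```cpp
#include <locale>

// Hypothetical helper (not part of the patch): lowercase one narrow char
// using the ctype<char> facet of the given locale.
inline char lower_in(const std::locale& loc, char c) {
    return std::tolower(c, loc);
}

// Hypothetical helper: true when tolower hands the value back unmodified,
// i.e. the locale has no distinct lowercase form for c.
inline bool unchanged_by_tolower(const std::locale& loc, char c) {
    return lower_in(loc, c) == c;
}
```

In the classic locale, `lower_in(std::locale::classic(), 'A')` yields `'a'`, while bytes outside ASCII such as `'\xDA'` and `'\xFA'` come back unchanged.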
CHANGES SINCE LAST ACTION
@@ -19,9 +19,6 @@
// XFAIL: with_system_cxx_lib=macosx10.7
// XFAIL: with_system_cxx_lib=macosx10.8
-// TODO: investigation needed
-// XFAIL: linux-gnu
@@ -47,7 +44,9 @@
assert(t.translate_nocase('.') == '.');
assert(t.translate_nocase('a') == 'a');
assert(t.translate_nocase('1') == '1');
- assert(t.translate_nocase('\xDA') == '\xFA');
+ // \xDA is initial char of MBS in UTF-8
+ assert(t.translate_nocase('\xDA') == '\xDA');
+ // \xFA is invalid as initial char in UTF-8
assert(t.translate_nocase('\xFA') == '\xFA');
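The two comments in the hunk above rest on how UTF-8 lead bytes are laid out. The following is a minimal sketch of that classification, assuming the RFC 3629 definition of UTF-8; the function name `utf8_lead_length` is hypothetical and does not appear in libc++ or in the patch.

```cpp
#include <cstddef>

// Returns the sequence length announced by a UTF-8 lead byte, or 0 when the
// byte cannot start a sequence (assumes the RFC 3629 definition of UTF-8).
inline std::size_t utf8_lead_length(unsigned char b) {
    if (b < 0x80) return 1;  // 0xxxxxxx: single-byte ASCII
    if (b < 0xC2) return 0;  // 10xxxxxx continuation, or 0xC0/0xC1 (overlong)
    if (b < 0xE0) return 2;  // 110xxxxx: starts a 2-byte sequence
    if (b < 0xF0) return 3;  // 1110xxxx: starts a 3-byte sequence
    if (b < 0xF5) return 4;  // 11110xxx: starts a 4-byte sequence, up to U+10FFFF
    return 0;                // 0xF5..0xFF: never valid in UTF-8
}
```

Under this sketch, `0xDA` (binary `11011010`) is the first byte of a 2-byte sequence, so on its own it is an incomplete multibyte character, while `0xFA` (binary `11111010`) cannot begin any sequence at all.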