[PATCH] D55746: [libcxx] [test] [re.traits] Correct expected values for invalid UTF-8

Michał Górny via Phabricator reviews at reviews.llvm.org
Mon Dec 17 07:33:26 PST 2018


mgorny updated this revision to Diff 178463.
mgorny retitled this revision from "[libcxx] [test] [re.traits] Remove asserts failing due to invalid UTF-8" to "[libcxx] [test] [re.traits] Correct expected values for invalid UTF-8".
mgorny edited the summary of this revision.
mgorny added a comment.

Very well. I originally wanted to avoid relying on any specific behavior for invalid input, but I suppose you're right. Unless I'm misunderstanding the spec, `translate_nocase()` should behave like `tolower()`, and `tolower()` is specified to return its argument unmodified when there is no lowercase representation. That does depend on how you define "no lowercase representation", but I think it's reasonable to assume that invalid characters have none.
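
As a rough illustration of that reasoning, here is a minimal sketch that queries the same byte through both the `std::ctype<char>` facet and `std::regex_traits<char>`. The helper names `lower_in`, `nocase_in` and `demo` are hypothetical and not part of the patch. The sketch only asserts on the classic "C" locale, where bytes outside ASCII have no lowercase form. What a named UTF-8 locale returns depends on the platform's C library, so that is deliberately not asserted here.

```cpp
#include <cassert>
#include <locale>
#include <regex>

// Hypothetical helper: lowercase one char via the ctype<char> facet of `loc`.
inline char lower_in(const std::locale& loc, char c) {
    return std::use_facet<std::ctype<char> >(loc).tolower(c);
}

// Hypothetical helper: the same query through regex_traits imbued with `loc`.
inline char nocase_in(const std::locale& loc, char c) {
    std::regex_traits<char> t;
    t.imbue(loc);
    return t.translate_nocase(c);
}

// In the classic locale, ASCII letters are folded and high bytes pass through.
inline void demo() {
    const std::locale& c = std::locale::classic();
    assert(lower_in(c, 'A') == 'a');
    assert(nocase_in(c, 'A') == 'a');
    // No lowercase representation, so the value comes back unmodified.
    assert(lower_in(c, '\xDA') == '\xDA');
    assert(nocase_in(c, '\xDA') == nocase_in(c, '\xDA'));
    assert(nocase_in(c, '\xDA') == lower_in(c, '\xDA'));
    assert(nocase_in(c, '\xFA') == '\xFA');
}
```

Passing a UTF-8 locale object instead of `std::locale::classic()` to `lower_in` and `nocase_in` is one way to compare platforms. The two helpers should agree with each other for any given locale.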

I've tested this behavior on Linux, FreeBSD and NetBSD. I suppose the original change, from behavior equivalent to what my patch now expects to the broken behavior we have right now, was accidental.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55746/new/

https://reviews.llvm.org/D55746

Files:
  test/std/re/re.traits/translate_nocase.pass.cpp


Index: test/std/re/re.traits/translate_nocase.pass.cpp
===================================================================
--- test/std/re/re.traits/translate_nocase.pass.cpp
+++ test/std/re/re.traits/translate_nocase.pass.cpp
@@ -19,9 +19,6 @@
 // XFAIL: with_system_cxx_lib=macosx10.7
 // XFAIL: with_system_cxx_lib=macosx10.8
 
-// TODO: investigation needed
-// XFAIL: linux-gnu
-
 #include <regex>
 #include <cassert>
 
@@ -47,7 +44,9 @@
         assert(t.translate_nocase('.') == '.');
         assert(t.translate_nocase('a') == 'a');
         assert(t.translate_nocase('1') == '1');
-        assert(t.translate_nocase('\xDA') == '\xFA');
+        // \xDA is initial char of MBS in UTF-8
+        assert(t.translate_nocase('\xDA') == '\xDA');
+        // \xFA is invalid as initial char in UTF-8
         assert(t.translate_nocase('\xFA') == '\xFA');
     }
     {



