[llvm-bugs] [Bug 40904] New: regex_search on MacOS gives wrong results when \D found in a character class
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Feb 28 08:33:12 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40904
Bug ID: 40904
Summary: regex_search on MacOS gives wrong results when \D
found in a character class
Product: libc++
Version: unspecified
Hardware: Macintosh
OS: All
Status: NEW
Severity: normal
Priority: P
Component: All Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: tom at kera.name
CC: llvm-bugs at lists.llvm.org, mclow.lists at gmail.com
Pre-C++20, there's no way to turn on /s, so instead of a pattern like /ab.cd/
(where the third character could be a newline) we must write something like
/ab[\d\D]cd/ (using the union of "digits" and "non-digits" to match "any
character").
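The workaround described above can be sketched as follows. This is a minimal illustration, not code from the report: the helper names search_matches, dot_matches_newline and union_matches_newline are hypothetical, and the results noted in the comments are what a conforming ECMAScript-grammar implementation should give, not what the affected libc++ build produces.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original report): returns true if
// `pattern`, compiled with the ECMAScript grammar, is found in `input`.
bool search_matches(const std::string& pattern, const std::string& input)
{
    const std::regex re(pattern, std::regex_constants::ECMAScript);
    return std::regex_search(input.cbegin(), input.cend(), re);
}

// In ECMAScript mode '.' does not match a line terminator, so this
// should be false on a conforming implementation.
bool dot_matches_newline()
{
    return search_matches(R"(ab.cd)", "ab\ncd");
}

// [\d\D] means "a digit or a non-digit", i.e. any character at all,
// including '\n', so this should be true on a conforming implementation.
bool union_matches_newline()
{
    return search_matches(R"(ab[\d\D]cd)", "ab\ncd");
}
```

Printing the two results with std::boolalpha from a small main() is enough to compare standard libraries.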
Unfortunately, libc++ doesn't match properly on this.
Example:
#include <regex>
#include <string>
#include <iostream>
#include <iomanip>
int main()
{
    const std::string input = "abZcd";
    char const* pattern = R"REGEX(^ab[\d\D]cd)REGEX";
    std::regex::flag_type flags = std::regex_constants::ECMAScript;
    std::regex re(pattern, flags);
    std::cout << std::boolalpha
              << std::regex_search(input.cbegin(), input.cend(), re) << '\n';
}
Output is "false" with:
$ clang --version
Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin18.2.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
But "true" (as expected) with g++ (GCC) 8.2.0.
Looking into it a bit, here are the results with some variants:
Pattern           Input    Should match?    Matches?
-----------------------------------------------------
/^ab[\d\D]cd/     abZcd    Yes              No   <--- !
/^ab[\d\D]cd/     ab5cd    Yes              No   <--- !
/^ab[\D]cd/       abZcd    Yes              No   <--- !
/^ab\Dcd/         abZcd    Yes              Yes
/^ab[\d]cd/       ab5cd    Yes              Yes
/^ab\dcd/         ab5cd    Yes              Yes
/^ab\dcd/         abZcd    No               No
/^ab\Dcd/         ab5cd    No               No
The common feature amongst the three failures is the \D inside a character
class.
The behaviour is the same when switching to std::regex_match.
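For anyone wanting to re-run the variants above, here is a rough sketch of a table-driven check. The struct Case and the functions report_cases and count_unexpected are names made up for this illustration; the expected values are copied from the "Should match?" column.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical test-case record: one row of the table in the report.
struct Case {
    std::string pattern;
    std::string input;
    bool expected;  // the "Should match?" column
};

// The eight pattern/input pairs from the report, in table order.
const std::vector<Case>& report_cases()
{
    static const std::vector<Case> cases = {
        {R"(^ab[\d\D]cd)", "abZcd", true},
        {R"(^ab[\d\D]cd)", "ab5cd", true},
        {R"(^ab[\D]cd)",   "abZcd", true},
        {R"(^ab\Dcd)",     "abZcd", true},
        {R"(^ab[\d]cd)",   "ab5cd", true},
        {R"(^ab\dcd)",     "ab5cd", true},
        {R"(^ab\dcd)",     "abZcd", false},
        {R"(^ab\Dcd)",     "ab5cd", false},
    };
    return cases;
}

// Counts the rows whose actual result differs from the expected one.
// With use_regex_match == true, std::regex_match is used instead of
// std::regex_search.
int count_unexpected(bool use_regex_match)
{
    int unexpected = 0;
    for (const Case& c : report_cases()) {
        const std::regex re(c.pattern, std::regex_constants::ECMAScript);
        const bool actual = use_regex_match
            ? std::regex_match(c.input, re)
            : std::regex_search(c.input, re);
        if (actual != c.expected) {
            ++unexpected;
        }
    }
    return unexpected;
}
```

A conforming implementation should return 0 from both count_unexpected(false) and count_unexpected(true). Going by the table, the affected libc++ build would be expected to return 3 in each case.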
For added fun, I get the expected results on Linux:
$ clang++ --version
clang version 5.0.0-3~16.04.1 (tags/RELEASE_500/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Related to bug 21363 (locale fun)?