[libc-commits] [libc] 1cbd25f - [NFC][libc] Clarifies underscores in n-char-sequence. (#102193)
via libc-commits
libc-commits at lists.llvm.org
Mon Aug 12 09:59:30 PDT 2024
Author: Mark de Wever
Date: 2024-08-12T18:59:24+02:00
New Revision: 1cbd25f882d10de1a23bb0287a70cde5037ebf42
URL: https://github.com/llvm/llvm-project/commit/1cbd25f882d10de1a23bb0287a70cde5037ebf42
DIFF: https://github.com/llvm/llvm-project/commit/1cbd25f882d10de1a23bb0287a70cde5037ebf42.diff
LOG: [NFC][libc] Clarifies underscores in n-char-sequence. (#102193)
The C standard specifies
  n-char-sequence:
    digit
    nondigit
    n-char-sequence digit
    n-char-sequence nondigit
nondigit is specified as one of:
  _ a b c d e f g h i j k l m
  n o p q r s t u v w x y z
  A B C D E F G H I J K L M
  N O P Q R S T U V W X Y Z
This means nondigit includes the underscore character. This patch
clarifies this status in the comments and the test.
Note that C17 specifies the n-char-sequence in NaN() as optional, and an
empty sequence is not a valid n-char-sequence. However, the current
comment's "0 or more" wording has the same effect as the pedantic
wording, so I left that part unchanged.
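The grammar quoted above can be restated as a small predicate. This is
only a minimal sketch for illustration, not code from str_to_float.h;
the names is_digit, is_nondigit and is_n_char_sequence are hypothetical,
and the letter range checks assume an ASCII-compatible execution
character set.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: the C "digit" set, '0' through '9'.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Hypothetical helper: the C "nondigit" set, i.e. '_' plus the 52 basic
// Latin letters. Assumes letters are contiguous, as they are in ASCII.
constexpr bool is_nondigit(char c) {
  return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Hypothetical helper: true when s is a non-empty run of digits and
// nondigits, which is what the recursive grammar rule expands to.
constexpr bool is_n_char_sequence(std::string_view s) {
  if (s.empty())
    return false; // the grammar has no empty production
  for (char c : s)
    if (!is_digit(c) && !is_nondigit(c))
      return false;
  return true;
}

// Underscores are part of nondigit, so they are accepted.
static_assert(is_n_char_sequence("underscores_are_ok"));
// An empty sequence is not an n-char-sequence; NaN() is valid only
// because the sequence itself is optional there.
static_assert(!is_n_char_sequence(""));
```

Under these assumptions, "1a" and "why_does_this_work" are both valid
sequences, while anything containing a character such as '-' or ' ' is
not.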
Added:
Modified:
libc/src/__support/str_to_float.h
libc/test/src/stdlib/strtof_test.cpp
Removed:
################################################################################
diff --git a/libc/src/__support/str_to_float.h b/libc/src/__support/str_to_float.h
index 3bcfc190026257..ffd6ebf27c7726 100644
--- a/libc/src/__support/str_to_float.h
+++ b/libc/src/__support/str_to_float.h
@@ -1159,13 +1159,11 @@ LIBC_INLINE StrToNumResult<T> strtofloatingpoint(const char *__restrict src) {
index += 3;
StorageType nan_mantissa = 0;
// this handles the case of `NaN(n-character-sequence)`, where the
- // n-character-sequence is made of 0 or more letters and numbers in any
- // order.
+ // n-character-sequence is made of 0 or more letters, numbers, or
+ // underscore characters in any order.
if (src[index] == '(') {
size_t left_paren = index;
++index;
- // Apparently it's common for underscores to also be accepted. No idea
- // why, but it's causing fuzz failures.
while (isalnum(src[index]) || src[index] == '_')
++index;
if (src[index] == ')') {
diff --git a/libc/test/src/stdlib/strtof_test.cpp b/libc/test/src/stdlib/strtof_test.cpp
index d7991745b69e6c..6a716c956291cc 100644
--- a/libc/test/src/stdlib/strtof_test.cpp
+++ b/libc/test/src/stdlib/strtof_test.cpp
@@ -200,7 +200,7 @@ TEST_F(LlvmLibcStrToFTest, NaNWithParenthesesValidSequenceInvalidNumberTests) {
run_test("NaN(1a)", 7, 0x7fc00000);
run_test("NaN(asdf)", 9, 0x7fc00000);
run_test("NaN(1A1)", 8, 0x7fc00000);
- run_test("NaN(why_does_this_work)", 23, 0x7fc00000);
+ run_test("NaN(underscores_are_ok)", 23, 0x7fc00000);
run_test(
"NaN(1234567890qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM_)",
68, 0x7fc00000);
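The second argument of run_test in the hunk above appears to be the
number of characters the parser is expected to consume. The sketch below
mirrors the scanning loop from the str_to_float.h hunk to show where
those lengths come from. It is not the LLVM libc implementation:
nan_consumed_length is a hypothetical name, the caller is assumed to
have already matched the "NaN" prefix, and the fallback to consuming
only "NaN" when no closing parenthesis follows is an assumption, since
that branch is not visible in the diff.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: returns how many characters of src belong to the
// NaN token, assuming src already starts with "NaN".
std::size_t nan_consumed_length(const char *src) {
  std::size_t index = 3; // skip the already-matched "NaN" prefix
  if (src[index] == '(') {
    std::size_t left_paren = index;
    ++index;
    // Same acceptance rule as the diff: letters, digits, or underscore.
    while (std::isalnum(static_cast<unsigned char>(src[index])) ||
           src[index] == '_')
      ++index;
    if (src[index] == ')')
      ++index; // the whole "(...)" belongs to the NaN token
    else
      index = left_paren; // assumption: malformed payload, keep only "NaN"
  }
  return index;
}
```

With this sketch, "NaN(underscores_are_ok)" yields 23 and "NaN(1a)"
yields 7, matching the lengths in the test, while an input such as
"NaN(a-b)" yields 3 under the stated fallback assumption.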