[llvm] 32b3f13 - [YAML] Trim trailing whitespace from plain scalars
Rahul Kayaith via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 9 18:57:02 PST 2023
Author: rkayaith
Date: 2023-02-09T21:56:57-05:00
New Revision: 32b3f13337ef0bf747705d058f4772c7fdabd736
URL: https://github.com/llvm/llvm-project/commit/32b3f13337ef0bf747705d058f4772c7fdabd736
DIFF: https://github.com/llvm/llvm-project/commit/32b3f13337ef0bf747705d058f4772c7fdabd736.diff
LOG: [YAML] Trim trailing whitespace from plain scalars
In some cases plain scalars are currently parsed with a trailing
newline. This shows up often when parsing JSON files; note the `\n`
after `456` in the output below:
```
$ cat test.yaml
{
"foo": 123,
"bar": 456
}
$ yaml-bench test.yaml -canonical
%YAML 1.2
---
!!map {
? !!str "foo"
: !!str "123",
? !!str "bar"
: !!str "456\n",
}
...
```
The trailing whitespace makes the conversion of the scalar to
int/bool/etc. fail, which leads to the issue seen here:
https://github.com/llvm/llvm-project/issues/15877
From reading the YAML spec (https://yaml.org/spec/1.2.2/#733-plain-style)
it seems like plain scalars should never end with whitespace, so this
change trims all trailing whitespace characters from the
value (specifically `b-line-feed`, `b-carriage-return`, `s-space`, and
`s-tab`).
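
As a rough illustration of the behaviour described above, here is a minimal
standalone sketch in standard C++17. It does not use LLVM's `StringRef`;
`rtrimChars`, `trimPlainScalar`, `parseWholeInt`, and `checkTrimExample` are
hypothetical names introduced only for this example, and the strict
whole-string integer parse is an assumption about how a caller might convert
the scalar.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for a "trim these characters from the right" helper:
// drops every trailing character of Value that appears in Chars.
constexpr std::string_view rtrimChars(std::string_view Value,
                                      std::string_view Chars) {
  const auto Last = Value.find_last_not_of(Chars);
  // npos means Value is empty or consists only of characters from Chars.
  return Last == std::string_view::npos ? std::string_view{}
                                        : Value.substr(0, Last + 1);
}

// Uses the same character set as the patch: line feed, carriage return,
// space, and tab.
constexpr std::string_view trimPlainScalar(std::string_view Value) {
  return rtrimChars(Value, "\x0A\x0D\x20\x09");
}

// Assumed strict conversion: succeeds only if the whole text is an integer.
std::optional<int> parseWholeInt(std::string_view Text) {
  int Result = 0;
  const char *End = Text.data() + Text.size();
  const auto [Ptr, Ec] = std::from_chars(Text.data(), End, Result);
  if (Ec != std::errc{} || Ptr != End)
    return std::nullopt; // Leftover characters (e.g. '\n') count as failure.
  return Result;
}

// Checks mirroring the `456\n` value from the example above.
inline void checkTrimExample() {
  assert(!parseWholeInt("456\n").has_value());
  assert(parseWholeInt(trimPlainScalar("456\n")) == 456);
}
```

With the old behaviour (trimming only `' '`), a value such as `456\n` would
reach the strict conversion unchanged and be rejected; trimming all four
characters first lets it parse.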
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D137118
Added:
llvm/test/YAMLParser/json.test
Modified:
llvm/lib/Support/YAMLParser.cpp
llvm/unittests/Support/YAMLIOTest.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index b85b1eb83ef89..6ac2c6aeeb46a 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2041,8 +2041,11 @@ StringRef ScalarNode::getValue(SmallVectorImpl<char> &Storage) const {
}
return UnquotedValue;
}
- // Plain or block.
- return Value.rtrim(' ');
+ // Plain.
+ // Trim whitespace ('b-char' and 's-white').
+ // NOTE: Alternatively we could change the scanner to not include whitespace
+ // here in the first place.
+ return Value.rtrim("\x0A\x0D\x20\x09");
}
StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue
diff --git a/llvm/test/YAMLParser/json.test b/llvm/test/YAMLParser/json.test
new file mode 100644
index 0000000000000..7d1b24caed987
--- /dev/null
+++ b/llvm/test/YAMLParser/json.test
@@ -0,0 +1,13 @@
+# RUN: yaml-bench -canonical %s | FileCheck %s
+
+# CHECK: !!map {
+# CHECK: ? !!str "foo"
+# CHECK: : !!str "123",
+# CHECK: ? !!str "bar"
+# CHECK: : !!str "456",
+# CHECK: }
+
+{
+ "foo": 123,
+ "bar": 456
+}
diff --git a/llvm/unittests/Support/YAMLIOTest.cpp b/llvm/unittests/Support/YAMLIOTest.cpp
index 2ed79cae31edc..f282d23dc500b 100644
--- a/llvm/unittests/Support/YAMLIOTest.cpp
+++ b/llvm/unittests/Support/YAMLIOTest.cpp
@@ -96,6 +96,15 @@ TEST(YAMLIO, TestMapRead) {
EXPECT_EQ(doc.foo, 3);
EXPECT_EQ(doc.bar, 5);
}
+
+ {
+ Input yin("{\"foo\": 3\n, \"bar\": 5}");
+ yin >> doc;
+
+ EXPECT_FALSE(yin.error());
+ EXPECT_EQ(doc.foo, 3);
+ EXPECT_EQ(doc.bar, 5);
+ }
}
TEST(YAMLIO, TestMalformedMapRead) {