[PATCH] D137118: [YAML] Trim trailing whitespace from plain scalars
Rahul Kayaith via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 31 19:14:53 PDT 2022
rkayaith created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
rkayaith added inline comments.
rkayaith published this revision for review.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
================
Comment at: llvm/lib/Support/YAMLParser.cpp:2045-2046
// Plain or block.
- return Value.rtrim(' ');
+ // Trim whitespace ('b-char' and 's-white').
+ return Value.rtrim("\x0A\x0D\x20\x09");
}
----------------
An alternative to this might be to change the [[ https://github.com/llvm/llvm-project/blob/cc2cf8b22b35a9cf87a61bf66c259455185fa6e3/llvm/lib/Support/YAMLParser.cpp#L1427 | scanner ]] to match the spec more closely, but I'm not very familiar with the spec or that code so I couldn't tell if that would be straightforward to do.
In some cases plain scalars are currently parsed with a trailing newline. In
particular this shows up often when parsing JSON files, e.g. below note the `\n`
after `456`:
$ cat test.yaml
{
"foo": 123,
"bar": 456
}
$ yaml-bench test.yaml -canonical
%YAML 1.2
---
!!map {
? !!str "foo"
: !!str "123",
? !!str "bar"
: !!str "456\n",
}
...
The trailing whitespace ends up causing the conversion of the scalar to
int/bool/etc. to fail, and is inconsistent with the YAML spec:
https://yaml.org/spec/1.2.2/#733-plain-style
This change trims any trailing whitespace characters from the scalar value
(specifically `b-line-feed`, `b-carriage-return`, `s-space`, and `s-tab`).
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D137118
Files:
llvm/lib/Support/YAMLParser.cpp
llvm/test/YAMLParser/json.test
llvm/unittests/Support/YAMLIOTest.cpp
Index: llvm/unittests/Support/YAMLIOTest.cpp
===================================================================
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -96,6 +96,15 @@
EXPECT_EQ(doc.foo, 3);
EXPECT_EQ(doc.bar, 5);
}
+
+ {
+ Input yin("{\"foo\": 3\n, \"bar\": 5}");
+ yin >> doc;
+
+ EXPECT_FALSE(yin.error());
+ EXPECT_EQ(doc.foo, 3);
+ EXPECT_EQ(doc.bar, 5);
+ }
}
TEST(YAMLIO, TestMalformedMapRead) {
Index: llvm/test/YAMLParser/json.test
===================================================================
--- /dev/null
+++ llvm/test/YAMLParser/json.test
@@ -0,0 +1,13 @@
+# RUN: yaml-bench -canonical %s | FileCheck %s
+
+# CHECK: !!map {
+# CHECK: ? !!str "foo"
+# CHECK: : !!str "123",
+# CHECK: ? !!str "bar"
+# CHECK: : !!str "456",
+# CHECK: }
+
+{
+ "foo": 123,
+ "bar": 456
+}
Index: llvm/lib/Support/YAMLParser.cpp
===================================================================
--- llvm/lib/Support/YAMLParser.cpp
+++ llvm/lib/Support/YAMLParser.cpp
@@ -2042,7 +2042,8 @@
return UnquotedValue;
}
// Plain or block.
- return Value.rtrim(' ');
+ // Trim whitespace ('b-char' and 's-white').
+ return Value.rtrim("\x0A\x0D\x20\x09");
}
StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D137118.472148.patch
Type: text/x-patch
Size: 1328 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20221101/27756159/attachment.bin>
More information about the llvm-commits
mailing list