<div dir="ltr"><br><div class="gmail_extra">Thanks.</div><div class="gmail_extra">Some comments below:</div><div class="gmail_extra"><br><div class="gmail_quote">2015-05-13 11:37 GMT-07:00 Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><div class="h5"><br>
> On 2015 May 13, at 09:53, Alex Lorenz <<a href="mailto:arphaman@gmail.com">arphaman@gmail.com</a>> wrote:<br>
><br>
> This update uses a new helper function that consumes a line break in another part of code that I forgot to update last time.<br>
><br>
><br>
> REPOSITORY<br>
> rL LLVM<br>
><br>
> <a href="http://reviews.llvm.org/D9503" target="_blank">http://reviews.llvm.org/D9503</a><br>
><br>
> Files:<br>
> include/llvm/Support/YAMLParser.h<br>
> lib/Support/YAMLParser.cpp<br>
> test/YAMLParser/spec-09-14.test<br>
> test/YAMLParser/spec-09-18.test<br>
> test/YAMLParser/spec-09-19.test<br>
> test/YAMLParser/spec-09-20.test<br>
> test/YAMLParser/spec-09-21.test<br>
> test/YAMLParser/spec-09-22.test<br>
> test/YAMLParser/spec-09-24.test<br>
> test/YAMLParser/spec-09-25.test<br>
> test/YAMLParser/spec-09-26.test<br>
> test/YAMLParser/spec-09-27.test<br>
> test/YAMLParser/spec-09-28.test<br>
> unittests/Support/YAMLParserTest.cpp<br>
> utils/yaml-bench/YAMLBench.cpp<br>
><br>
> EMAIL PREFERENCES<br>
> <a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
</div></div>> <D9503.25702.patch><br>
<br>
It's a lot easier to see what's going on now; thanks.<br>
<br>
I wonder whether "returning IsDone" is the right thing for your<br>
helpers, or whether "returning IsError" would be cleaner. I also<br>
think in one spot you fail to return correctly on error (I pointed<br>
it out below somewhere).<br>
<br>
Many of the rest of the comments amount to: "skip `else` after a<br>
`return`".<br>
<div><div class="h5"><br>
> Index: lib/Support/YAMLParser.cpp<br>
> ===================================================================<br>
> --- lib/Support/YAMLParser.cpp<br>
> +++ lib/Support/YAMLParser.cpp<br>
> @@ -101,6 +101,7 @@<br>
> void Node::anchor() {}<br>
> void NullNode::anchor() {}<br>
> void ScalarNode::anchor() {}<br>
> +void BlockScalarNode::anchor() {}<br>
> void KeyValueNode::anchor() {}<br>
> void MappingNode::anchor() {}<br>
> void SequenceNode::anchor() {}<br>
> @@ -128,6 +129,7 @@<br>
> TK_Key,<br>
> TK_Value,<br>
> TK_Scalar,<br>
> + TK_BlockScalar,<br>
> TK_Alias,<br>
> TK_Anchor,<br>
> TK_Tag<br>
> @@ -137,6 +139,9 @@<br>
> /// of the token in the input.<br>
> StringRef Range;<br>
><br>
> + /// The value of a block scalar node.<br>
> + std::string Value;<br>
> +<br>
> Token() : Kind(TK_Error) {}<br>
> };<br>
> }<br>
> @@ -348,6 +353,14 @@<br>
> /// b-break.<br>
> StringRef::iterator skip_b_break(StringRef::iterator Position);<br>
><br>
> + /// @brief Skip a single s-space[31] starting at Position.<br>
> + ///<br>
> + /// An s-space is 0x20<br>
> + ///<br>
> + /// @returns The code unit after the s-space, or Position if it's not a<br>
> + /// s-space.<br>
> + StringRef::iterator skip_s_space(StringRef::iterator Position);<br>
> +<br>
> /// @brief Skip a single s-white[33] starting at Position.<br>
> ///<br>
> /// A s-white is 0x20 | 0x9<br>
</div></div>> @@ -373,6 +386,13 @@<br>
> StringRef::iterator skip_while( SkipWhileFunc Func<br>
> , StringRef::iterator Position);<br>
><br>
> + /// @brief Skip minimal well-formed code unit subsequences until Func<br>
> + /// returns its input.<br>
> + ///<br>
> + /// @returns The code unit after the last minimal well-formed code unit<br>
> + /// subsequence that Func accepted.<br>
> + void advanceWhile(SkipWhileFunc Func);<br>
> +<br>
> /// @brief Scan ns-uri-char[39]s starting at Cur.<br>
> ///<br>
> /// This updates Cur and Column while scanning.<br>
> @@ -393,6 +413,11 @@<br>
> /// Pos is whitespace or a new line<br>
> bool isBlankOrBreak(StringRef::iterator Position);<br>
><br>
> + /// @brief Consume a single b-break[28] if it's present at the current<br>
> + /// position. Return false if the code unit at the current position ins't a<br>
> + /// line break.<br>
> + bool consumeLineBreakIfPresent();<br>
> +<br>
> /// @brief If IsSimpleKeyAllowed, create and push_back a new SimpleKey.<br>
> void saveSimpleKeyCandidate( TokenQueueT::iterator Tok<br>
> , unsigned AtColumn<br>
> @@ -466,6 +491,26 @@<br>
> /// @brief Scan a block scalar starting with | or >.<br>
> bool scanBlockScalar(bool IsLiteral);<br>
><br>
> + /// @brief Scan a chomping indicator in a block scalar header.<br>
> + char scanBlockChompingIndicator();<br>
> +<br>
> + /// @brief Scan an indentation indicator in a block scalar header.<br>
> + unsigned scanBlockIndentationIndicator();<br>
> +<br>
> + /// @brief Scan a block scalar header.<br>
> + /// Return true if we've reached the end of the block scalar.<br>
> + bool scanBlockScalarHeader(char &ChompingIndicator, unsigned &IndentIndicator,<br>
> + bool &WasScanningSuccessful);<br>
> +<br>
> + /// @brief Look for the indentation level of a block scalar.<br>
> + /// Return true if we've reached the end of the block scalar.<br>
> + bool findBlockScalarIndent(unsigned &BlockIndent, unsigned BlockExitIndent,<br>
> + unsigned &LineBreaks);<br>
> +<br>
> + /// @brief Scan the indentation of a text line in a block scalar.<br>
> + /// Return true if we've reached the end of the block scalar.<br>
> + bool scanBlockScalarIndent(unsigned BlockIndent, unsigned BlockExitIndent);<br>
> +<br>
> /// @brief Scan a tag of the form !stuff.<br>
> bool scanTag();<br>
><br>
> @@ -612,6 +657,9 @@<br>
<span class="">> case Token::TK_Scalar:<br>
> OS << "Scalar: ";<br>
> break;<br>
> + case Token::TK_BlockScalar:<br>
> + OS << "Block Scalar: ";<br>
> + break;<br>
> case Token::TK_Alias:<br>
> OS << "Alias: ";<br>
> break;<br>
</span>> @@ -816,6 +864,13 @@<br>
<span class="">> return Position;<br>
> }<br>
><br>
> +StringRef::iterator Scanner::skip_s_space(StringRef::iterator Position) {<br>
> + if (Position == End)<br>
> + return Position;<br>
> + if (*Position == ' ')<br>
> + return Position + 1;<br>
> + return Position;<br>
> +}<br>
><br>
> StringRef::iterator Scanner::skip_s_white(StringRef::iterator Position) {<br>
> if (Position == End)<br>
</span>> @@ -844,6 +899,12 @@<br>
> return Position;<br>
> }<br>
><br>
> +void Scanner::advanceWhile(SkipWhileFunc Func) {<br>
> + auto Final = skip_while(Func, Current);<br>
> + Column += Final - Current;<br>
> + Current = Final;<br>
> +}<br>
> +<br>
> static bool is_ns_hex_digit(const char C) {<br>
> return (C >= '0' && C <= '9')<br>
> || (C >= 'a' && C <= 'z')<br>
> @@ -906,6 +967,16 @@<br>
> return false;<br>
> }<br>
><br>
> +bool Scanner::consumeLineBreakIfPresent() {<br>
> + auto Next = skip_b_break(Current);<br>
> + if (Next == Current)<br>
> + return false;<br>
<span class="">> + Column = 0;<br>
> + ++Line;<br>
</span>> + Current = Next;<br>
<span class="">> + return true;<br>
> +}<br>
> +<br>
</span>> void Scanner::saveSimpleKeyCandidate( TokenQueueT::iterator Tok<br>
> , unsigned AtColumn<br>
> , bool IsRequired) {<br>
> @@ -1374,38 +1445,220 @@<br>
> return true;<br>
> }<br>
><br>
> -bool Scanner::scanBlockScalar(bool IsLiteral) {<br>
<span class="">> - StringRef::iterator Start = Current;<br>
</span><span class="">> - skip(1); // Eat | or ><br>
> - while(true) {<br>
> - StringRef::iterator i = skip_nb_char(Current);<br>
> - if (i == Current) {<br>
> - if (Column == 0)<br>
> - break;<br>
> - i = skip_b_break(Current);<br>
> - if (i != Current) {<br>
> - // We got a line break.<br>
> - Column = 0;<br>
> - ++Line;<br>
> - Current = i;<br>
> - continue;<br>
</span>> +char Scanner::scanBlockChompingIndicator() {<br>
> + char Indicator = ' ';<br>
<span class="">> + if (Current != End && (*Current == '+' || *Current == '-')) {<br>
</span>> + Indicator = *Current;<br>
> + skip(1);<br>
> + }<br>
> + return Indicator;<br>
> +}<br>
> +<br>
> +/// \brief Return the number of trailing line breaks that have to be emitted<br>
> +/// as determined by the chomping indicator.<br>
<br>
Summarize the function in one line. If that's not enough, add more<br>
after a blank line. You don't need the `\brief` anymore, since we've<br>
turned on auto-brief. Something like this:<br>
<br>
/// Get the number of line breaks after chomping.<br>
///<br>
/// Return the number of trailing line breaks to emit, depending on<br>
/// \c ChompingIndicator.<br>
<br>
Please adjust the rest of your doxygen comments similarly.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> +static unsigned applyChompingIndicator(char ChompingIndicator,<br>
<br>
I think `getChompedLineBreaks()` is a better name.<br></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + unsigned LineBreaks, StringRef Str) {<br>
<br>
I think your comments should be more succinct. The code would be easier<br>
to read.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + // Don't output any trailing new lines if the stripping chomping behaviour is<br>
> + // specified.<br>
<br>
</span>If you move the comment inside the `if`, you don't need to describe the<br>
condition. This could just be a comment on the same line as the<br>
`return` (or the line before) that said something like "// Strip all<br>
line breaks.".<br></blockquote><div><br></div><div>Fixed.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + if (ChompingIndicator == '-')<br>
</span>> + return 0;<br>
> + // Keep trailing lines when the keep chomping behaviour is specified.<br>
<br>
Similarly, this comment could be "// Keep all line breaks.".<br>
<br>
> + else if (ChompingIndicator == '+')<br>
<br>
This should be just `if`.<br>
<br>
> + return LineBreaks;<br>
<span class="">> + // Clip trailing lines (default chomping behaviour). The final line break<br>
> + // is preserved as long as the string isn't empty, but the other trailing<br>
> + // lines aren't kept.<br>
<br>
</span>And this could be "Clip trailing lines."<br>
<br>
> + return Str.empty() ? 0 : 1;<br>
<br>
Resulting in something like:<br>
<br>
if (ChompingIndicator == '-') // Strip all line breaks.<br>
return 0;<br>
<br>
if (ChompingIndicator == '+') // Keep all line breaks.<br>
return LineBreaks;<br>
<br>
// Clip trailing lines.<br>
return Str.empty() ? 0 : 1;<br></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> +}<br>
> +<br>
> +unsigned Scanner::scanBlockIndentationIndicator() {<br>
> + unsigned Indent = 0;<br>
<span class="">> + if (Current != End && (*Current >= '1' && *Current <= '9')) {<br>
</span>> + Indent = unsigned(*Current - '0');<br>
> + skip(1);<br>
> + }<br>
> + return Indent;<br>
> +}<br>
> +<br>
> +bool Scanner::scanBlockScalarHeader(char &ChompingIndicator,<br>
> + unsigned &IndentIndicator,<br>
> + bool &WasScanningSuccessful) {<br>
<span class="">> + // Chomping indicator ([164] c-chomping-indicator)<br>
<br>
</span>Remove this comment. Add this to the docs for<br>
`scanBlockChompingIndicator()` if it's useful.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + ChompingIndicator = scanBlockChompingIndicator();<br>
<span class="">> + // Indentation indicator ([163] c-indentation-indicator)<br>
<br>
</span>Remove this comment. Add this to the docs for<br>
`scanBlockIndentationIndicator()` if it's useful.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + IndentIndicator = scanBlockIndentationIndicator();<br>
> + // Check for the chomping indicator once again.<br>
> + if (ChompingIndicator == ' ')<br>
> + ChompingIndicator = scanBlockChompingIndicator();<br>
<span class="">> + Current = skip_while(&Scanner::skip_s_white, Current);<br>
> + skipComment();<br>
</span><span class="">> +<br>
> + if (Current == End) {<br>
> + // End of file after block header, we have an empty scalar.<br>
</span>> + WasScanningSuccessful = true;<br>
> + return true;<br>
> + }<br>
<span class="">> + auto I = skip_b_break(Current);<br>
> + if (I == Current) {<br>
> + setError("Expected a line break after block scalar header", Current);<br>
</span>> + WasScanningSuccessful = false;<br>
> + return true;<br>
> + }<br>
<span class="">> + Current = I;<br>
> + Column = 0;<br>
> + ++Line;<br>
</span>> + return false;<br>
> +}<br>
> +<br>
> +bool Scanner::findBlockScalarIndent(unsigned &BlockIndent,<br>
> + unsigned BlockExitIndent,<br>
> + unsigned &LineBreaks) {<br>
<span class="">> + unsigned MaxAllSpaceLineCharacters = 0;<br>
> + StringRef::iterator LongestAllSpaceLine;<br>
> +<br>
</span>> + while (true) {<br>
> + advanceWhile(&Scanner::skip_s_space);<br>
<span class="">> + // If there is a non breaking character next, then this is a non empty<br>
> + // line, thus we can use the discovered indentation as the block's<br>
> + // indentation.<br>
<br>
</span>I'd move this...<br>
<span class=""><br>
> + if (skip_nb_char(Current) != Current) {<br>
<br>
</span>... to here:<br>
<br>
// This line isn't empty, so find the indentation.<br></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + if (Column <= BlockExitIndent) {<br>
> + // This is the end of the block literal<br>
</span>> + return true;<br>
> } else {<br>
<br>
Remove the `else`.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> - // There was an error, which should already have been printed out.<br>
> - return false;<br>
</span>> + BlockIndent = Column;<br>
> + // So here we enforce the rule that the leading all space lines can't<br>
<span class="">> + // have more characters than the block's indentation level<br>
<br>
</span>Is this comment useful? It looks to me like it's obvious from the error<br>
message.<br></blockquote><div><br></div><div>Removed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + if (MaxAllSpaceLineCharacters > BlockIndent) {<br>
> + setError("A leading all-space line must not have too many spaces",<br>
<br>
</span>Do you have a test for this error?<br></blockquote><div><br></div><div><span style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">Yes, it’s tested in the YAML parser unit tests.</span><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
"too many" is vague. I suggest something more like:<br>
<br>
"Leading all-space lines must be smaller than the block indent"<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + LongestAllSpaceLine);<br>
> + }<br>
> + // We found the block's indentation.<br>
> + break;<br>
<br>
This should just be `return false`.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + }<br>
> + } else if (skip_b_break(Current) != Current) {<br>
<br>
</span>Remove the `else`.<br></blockquote><div> </div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + // This is an all space line, so we have to record the amount of<br>
> + // space characters it has so that later when we find the first<br>
> + // text line that defines the indentation level we can make sure<br>
> + // that all previous all space lines don't have more space<br>
> + // characters than the indentation level.<br>
<br>
</span>This is long and inhibits legibility of the code...<br>
<br>
> + if (Column > MaxAllSpaceLineCharacters) {<br>
<br>
... so move it to here and skip the stuff that's obvious from reading<br>
the code.<br>
<br>
// Record the longest all-space line in case it's longer than the<br>
// block indent.<br></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + MaxAllSpaceLineCharacters = Column;<br>
> + LongestAllSpaceLine = Current;<br>
> }<br>
> }<br>
</span>> - Current = i;<br>
<span class="">> +<br>
> + // Check for EOF.<br>
> + if (Current == End)<br>
</span>> + return true;<br>
> +<br>
> + if (!consumeLineBreakIfPresent())<br>
> + return true;<br>
> + ++LineBreaks;<br>
> + }<br>
<span class="">> + return false;<br>
> +}<br>
> +<br>
</span>> +bool Scanner::scanBlockScalarIndent(unsigned BlockIndent,<br>
> + unsigned BlockExitIndent) {<br>
> + // Skip the indentation.<br>
<span class="">> + while (Column < BlockIndent) {<br>
> + auto I = skip_s_space(Current);<br>
> + if (I == Current)<br>
> + break;<br>
> + Current = I;<br>
</span>> ++Column;<br>
<span class="">> }<br>
><br>
> - if (Start == Current) {<br>
> - setError("Got empty block scalar", Start);<br>
</span>> - return false;<br>
<span class="">> + // If this line isn't empty then we have to check the indentation to<br>
> + // see if the block scalar ends.<br>
> + if (skip_nb_char(Current) != Current) {<br>
> + if (Column <= BlockExitIndent) {<br>
> + // This is the end of the block literal<br>
</span>> + return true;<br>
<span class="">> + } else if (Column < BlockIndent) {<br>
<br>
</span>Remove the `else`.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + if (Current != End && *Current == '#') {<br>
> + // This is a trailing comment<br>
</span>> + return true;<br>
<br>
What about spaces followed by a comment? Is this duplicated from your<br>
other comment-handling code?<br></blockquote><div><br></div><div><div>I’m not sure what you mean here. The leading all-space lines treat comments</div><div>like content, this doesn’t come up there.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + } else {<br>
<br>
Remove the `else`.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + setError("A text line is less indented than the block scalar", Current);<br>
<br>
Aren't you returning "IsDone"? Should you return `true` here?<br></blockquote><div><br></div><div>I'm returning false for an error now.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + }<br>
> + }<br>
> + }<br>
> + return false;<br>
<br>
If you invert the outer condition, you can return early, and this'll all<br>
be easier to read:<br>
<br>
if (skip_nb_char(Current) == Current)<br>
return false;<br>
<br>
if (Column <= BlockExitIndent) // End of the block literal.<br>
return true;<br>
<br>
if (Curent != End && *Current == '#') // Trailing comment.<br>
return true;<br>
<br>
setError(...);<br>
return false;<br>
<br>
This logic matches what you have (I think!), but it seems kind of off.<br>
In particular, I'm not sure the error makes sense -- it seems to<br>
describe the "Column <= BlockExitIndent" scenario. Once you've<br>
refactored this, please review the logic.<br></blockquote><div><br></div><div>Refactored.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> +}<br>
> +<br>
> +bool Scanner::scanBlockScalar(bool IsLiteral) {<br>
<span class="">> + auto Start = Current;<br>
</span>> + // Eat '|' or '>'<br>
> + assert(*Current == '|' || *Current == '>');<br>
> + skip(1);<br>
> +<br>
> + // Block Header ([162] c-b-block-header)<br>
<br>
This comment should be part of `scanBlockScalarHeader()`, if anywhere.<br></blockquote><div><br></div><div>Removed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + char ChompingIndicator;<br>
> + unsigned BlockIndent;<br>
> + bool WasScanningSuccessful;<br>
> + bool Done = scanBlockScalarHeader(ChompingIndicator, BlockIndent,<br>
> + WasScanningSuccessful);<br>
<br>
Usually we use returns from parsing functions as an error mechanism<br>
(see, e.g., `LLParser.cpp`). Should you really use it for Done-ness?<br>
If you use it for errors, you get code like this:<br>
<br>
if (!scanBlockScalarHeader(...))<br>
return false;<br>
<br>
Play with it a bit, and see if it looks cleaner (you'd need to change<br>
the rest of your functions as well). I don't feel strongly either way.<br></blockquote><div><br></div><div>That would be good for consistency, I use the error return now in the</div><div>updated patch.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + if (Done) {<br>
> + if (WasScanningSuccessful) {<br>
<span class="">> + Token T;<br>
> + T.Kind = Token::TK_BlockScalar;<br>
</span>> + T.Range = StringRef(Start, Current - Start);<br>
> + TokenQueue.push_back(T);<br>
<br>
Would this logic fit inside `scanBlockScalarHeader()`?<br></blockquote><div><br></div><div>Yes, done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + }<br>
> + return WasScanningSuccessful;<br>
> + }<br>
> + Start = Current;<br>
> +<br>
> + // Look for a first non-blank line to determine the indenation of the block<br>
> + // scalar.<br>
<br>
Not a useful comment. The name of the function you're calling explains<br>
this.<br></blockquote><div><br></div><div>Removed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> + unsigned BlockExitIndent = Indent < 0 ? 0 : (unsigned)Indent;<br>
> + unsigned LineBreaks = 0;<br>
</span>> + if (BlockIndent == 0)<br>
> + Done = findBlockScalarIndent(BlockIndent, BlockExitIndent, LineBreaks);<br>
> +<br>
> + // Scan the block's scalars body.<br>
> + SmallString<256> Str;<br>
> + while (!Done) {<br>
<span class="">> + // Handle the block's indentation<br>
<br>
</span>The function name below is sufficient. This comment isn't useful.<br></blockquote><div><br></div><div>Removed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + if (scanBlockScalarIndent(BlockIndent, BlockExitIndent))<br>
<br>
What about errors from this?<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + break;<br>
<span class="">> +<br>
> + // Parse the current line.<br>
</span>> + auto LineStart = Current;<br>
> + advanceWhile(&Scanner::skip_nb_char);<br>
> + if (LineStart != Current) {<br>
<span class="">> + for (unsigned I = 0; I < LineBreaks; ++I)<br>
</span>> + Str.push_back('\n');<br>
<br>
This loop is the same as:<br>
<br>
Str.append(LineBreaks, '\n');<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + LineBreaks = 0;<br>
> + Str.append(StringRef(LineStart, Current - LineStart));<br>
<br>
I think it'll be easier to read if you put the two `Str`-modifying<br>
operations next to each other, and then reset `LineBreaks`.<br></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + }<br>
<span class="">> +<br>
> + // Check for EOF.<br>
> + if (Current == End)<br>
</span>> + break;<br>
> +<br>
> + if (!consumeLineBreakIfPresent())<br>
> + break;<br>
> + ++LineBreaks;<br>
> }<br>
><br>
> + if (Current == End && !LineBreaks)<br>
<span class="">> + // Ensure that there is at least one line break before the end of file.<br>
</span>> + LineBreaks = 1;<br>
<br>
Is this tested somewhere? Does it interact correctly with the<br>
chomping style?<br></blockquote><div><br></div><div><span style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">Yes - it is tested at the 'YAMLParser/spec-09-24.test’ test.</span><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + LineBreaks = applyChompingIndicator(ChompingIndicator, LineBreaks, Str);<br>
<span class="">> + for (unsigned I = 0; I < LineBreaks; ++I)<br>
</span>> + Str.push_back('\n');<br>
<br>
Another case of:<br>
<br>
Str.append(LineBreaks, '\n');<br>
<br>
With the rename I suggested for this function, I think this would read<br>
well with both lines merged:<br>
<br>
Str.append(getChompedLineBreaks(ChompingIndicator, LineBreaks, Str),<br>
'\n');<br>
<br>
Up to you though.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> +<br>
> + // New lines may start a simple key.<br>
> + if (!FlowLevel)<br>
> + IsSimpleKeyAllowed = true;<br>
<br>
</span>Is this tested somewhere? Can it be in a separate commit?<br></blockquote><div><br></div><div><div style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">Yes - it is tested by several tests. I would really prefer this to be in this</div><div style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">commit as It can’t be tested if I were to commit it before and</div><div style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">after it would break most tests in this one. Those tests are derived from</div><div style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">PyYAML and I think it’s important to try and keep the compatibility.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span class=""><br>
> +<br>
> Token T;<br>
> - T.Kind = Token::TK_Scalar;<br>
</span><span class="">> + T.Kind = Token::TK_BlockScalar;<br>
</span><span class="">> T.Range = StringRef(Start, Current - Start);<br>
</span>> + T.Value = std::move(Str.str().str());<br>
<br>
The `std::move()` isn't useful -- this is already returning a temporary.<br></blockquote><div><br></div><div>Fixed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> TokenQueue.push_back(T);<br>
> return true;<br>
> }<br>
> @@ -1607,6 +1860,7 @@<br>
<span class="">> case NK_Null:<br>
> return "tag:<a href="http://yaml.org" target="_blank">yaml.org</a>,2002:null";<br>
> case NK_Scalar:<br>
> + case NK_BlockScalar:<br>
> // TODO: Tag resolution.<br>
> return "tag:<a href="http://yaml.org" target="_blank">yaml.org</a>,2002:str";<br>
<br>
</span>Is this change tested somewhere?<br></blockquote><div> </div><div><span style="color:rgb(0,0,0);font-family:-apple-system-font;font-size:12px;line-height:16px">Yes - all the file check tests check the !!str tag, and yaml-bench derives this tag from this string.</span><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> case NK_Mapping:<br>
> @@ -2138,6 +2392,11 @@<br>
<div class=""><div class="h5">> , AnchorInfo.Range.substr(1)<br>
> , TagInfo.Range<br>
> , T.Range);<br>
> + case Token::TK_BlockScalar:<br>
> + getNext();<br>
> + return new (NodeAllocator)<br>
> + BlockScalarNode(stream.CurrentDoc, AnchorInfo.Range.substr(1),<br>
> + TagInfo.Range, T.Value, T.Range);<br>
> case Token::TK_Key:<br>
> // Don't eat the TK_Key, KeyValueNode expects it.<br>
> return new (NodeAllocator)<br>
<br>
</div></div></blockquote></div><br></div></div>