[PATCH] Implement TokenizeWindowsCommandLine.
Reid Kleckner
rnk at google.com
Tue Jul 30 10:41:56 PDT 2013
LGTM
Thanks! I think this'll be pretty handy for various LLVM tools.
================
Comment at: lib/Support/CommandLine.cpp:504
@@ +503,3 @@
+/// escape double quote. This method consumes runs of backslashes as well as the
+/// following backslash if it's escaped.
+///
----------------
Did you mean the following quote?
================
Comment at: lib/Support/CommandLine.cpp:509
@@ +508,3 @@
+/// quote remains unconsumed. The double quote will later be interpreted as
+/// the start of a quoted string in the main loop outside of this function.
+///
----------------
I guess I'd say "start or end" of a quoted string.
================
Comment at: lib/Support/CommandLine.cpp:516
@@ +515,3 @@
+///
+/// * Otherwise, backslahses are interpreted literally.
+static size_t parseBackslash(StringRef Src, size_t I, SmallString<128> &Token) {
----------------
typo on backslashes
================
Comment at: lib/Support/CommandLine.cpp:526
@@ +525,3 @@
+
+ bool followedByDoubleQuote = (I != E && Src[I] == '"');
+ if (followedByDoubleQuote) {
----------------
nit: variables are StudlyCaps.
================
Comment at: lib/Support/CommandLine.cpp:547
@@ +546,3 @@
+ // end of the source string.
+ enum { INIT, UNQUOTED, QUOTED } state = INIT;
+ for (size_t I = 0, E = Src.size(); I != E; ++I) {
----------------
nit: enums and the variable are StudlyCaps.
================
Comment at: lib/Support/CommandLine.cpp:536
@@ +535,3 @@
+
+ for (int J = 0; J < BackslashCount; ++J)
+ Token.push_back('\\');
----------------
This can be Token.append(BackslashCount, '\\');
================
Comment at: unittests/Support/CommandLineTest.cpp:156
@@ +155,3 @@
+TEST(CommandLineTest, TokenizeWindowsCommandLine) {
+ const char *Input = "a\\b c\\\\d e\\\\\"f g\" h\\\"i j\\\\\\\"k \"lmn\" o pqr "
+ "\"st \\\"u\" \\v";
----------------
Oof, reading this test case is hard, but I don't have any good suggestions. :)
================
Comment at: lib/Support/CommandLine.cpp:545
@@ +544,3 @@
+
+ // This is a small state machine to consume characters until it reaches the
+ // end of the source string.
----------------
Hm, this looks nicer than the way I wrote the GNU tokenizer. :)
================
Comment at: unittests/Support/CommandLineTest.cpp:137
@@ -138,1 +136,3 @@
+ parse(Input, Saver, Actual);
+ EXPECT_EQ(OutputSize, Actual.size());
for (unsigned I = 0, E = Actual.size(); I != E; ++I) {
----------------
Huh, I wasn't aware that EXPECT_EQ works in a free helper function.
http://llvm-reviews.chandlerc.com/D1235
More information about the llvm-commits
mailing list