[PATCH] Implement TokenizeWindowsCommandLine.

Reid Kleckner rnk at google.com
Tue Jul 30 10:41:56 PDT 2013


  LGTM

  Thanks!  I think this'll be pretty handy for various LLVM tools.


================
Comment at: lib/Support/CommandLine.cpp:504
@@ +503,3 @@
+/// escape double quote. This method consumes runs of backslashes as well as the
+/// following backslash if it's escaped.
+///
----------------
Did you mean the following quote?

================
Comment at: lib/Support/CommandLine.cpp:509
@@ +508,3 @@
+///    quote remains unconsumed. The double quote will later be interpreted as
+///    the start of a quoted string in the main loop outside of this function.
+///
----------------
I guess I'd say "start or end" of a quoted string.

================
Comment at: lib/Support/CommandLine.cpp:516
@@ +515,3 @@
+///
+///  * Otherwise, backslahses are interpreted literally.
+static size_t parseBackslash(StringRef Src, size_t I, SmallString<128> &Token) {
----------------
typo on backslashes

================
Comment at: lib/Support/CommandLine.cpp:526
@@ +525,3 @@
+
+  bool followedByDoubleQuote = (I != E && Src[I] == '"');
+  if (followedByDoubleQuote) {
----------------
nit: variables are StudlyCaps.

================
Comment at: lib/Support/CommandLine.cpp:547
@@ +546,3 @@
+  // end of the source string.
+  enum { INIT, UNQUOTED, QUOTED } state = INIT;
+  for (size_t I = 0, E = Src.size(); I != E; ++I) {
----------------
nit: enums and the variable are StudlyCaps.

================
Comment at: lib/Support/CommandLine.cpp:536
@@ +535,3 @@
+
+  for (int J = 0; J < BackslashCount; ++J)
+    Token.push_back('\\');
----------------
This can be Token.append(BackslashCount, '\\');

================
Comment at: unittests/Support/CommandLineTest.cpp:156
@@ +155,3 @@
+TEST(CommandLineTest, TokenizeWindowsCommandLine) {
+  const char *Input = "a\\b c\\\\d e\\\\\"f g\" h\\\"i j\\\\\\\"k \"lmn\" o pqr "
+                      "\"st \\\"u\" \\v";
----------------
Oof, reading this test case is hard, but I don't have any good suggestions.  :)

================
Comment at: lib/Support/CommandLine.cpp:545
@@ +544,3 @@
+
+  // This is a small state machine to consume characters until it reaches the
+  // end of the source string.
----------------
Hm, this looks nicer than the way I wrote the GNU tokenizer.  :)

================
Comment at: unittests/Support/CommandLineTest.cpp:137
@@ -138,1 +136,3 @@
+  parse(Input, Saver, Actual);
+  EXPECT_EQ(OutputSize, Actual.size());
   for (unsigned I = 0, E = Actual.size(); I != E; ++I) {
----------------
Huh, I wasn't aware that EXPECT_EQ works in a free helper function.


http://llvm-reviews.chandlerc.com/D1235



More information about the llvm-commits mailing list