[PATCH] D122914: [Windows] Fix handling of \" in program name on cmd line.

Simon Tatham via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 1 09:23:02 PDT 2022


simon_tatham created this revision.
simon_tatham added reviewers: amccarth, rnk, ruiu, john.brawn.
Herald added subscribers: dexonsmith, hiraditya.
Herald added a project: All.
simon_tatham requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Bugzilla #47579: if you invoke clang on Windows via a pathname in
which a quoted section closes just after a backslash, e.g.

  "C:\Program Files\Whatever\"clang.exe

then cmd.exe and CreateProcess will correctly find the binary, because
when they parse the program name at the start of the command line,
they don't regard the \ before the " as having any kind of escaping
effect. This is different from the behaviour of the Windows standard C
library when it parses the rest of the command line, which would
consider that \" not to close the quoted string.

But this confuses windows::GetCommandLineArguments, because the
Windows API function GetCommandLineW() will return a command line
containing that \" sequence, and cl::TokenizeWindowsCommandLine will
tokenize the whole string according to the C library's rules. So it
will misidentify where the program name stops and the arguments start.
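
To make the failure mode concrete, here is a minimal sketch of C-library-style splitting applied to the whole command line. It is an illustration under simplifying assumptions, not LLVM's cl::TokenizeWindowsCommandLine: the function name splitCRuntimeStyle is hypothetical, only space and tab count as separators, and the "" special case inside quoted strings is ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified model of the C runtime's argument-splitting rules
// (hypothetical helper, not an LLVM API):
//   - 2n backslashes followed by " give n backslashes and toggle quoting;
//   - 2n+1 backslashes followed by " give n backslashes and a literal ";
//   - backslashes not followed by " are taken literally.
std::vector<std::string> splitCRuntimeStyle(const std::string &Cmd) {
  std::vector<std::string> Args;
  std::string Cur;
  bool InQuotes = false, HaveToken = false;
  std::size_t I = 0;
  while (I < Cmd.size()) {
    char C = Cmd[I];
    if (C == '\\') {
      // Count the run of backslashes, then look at what follows it.
      std::size_t N = 0;
      while (I < Cmd.size() && Cmd[I] == '\\') { ++N; ++I; }
      bool BeforeQuote = I < Cmd.size() && Cmd[I] == '"';
      Cur.append(BeforeQuote ? N / 2 : N, '\\');
      if (BeforeQuote && N % 2 == 1) {
        Cur.push_back('"'); // odd count: the quote is escaped
        ++I;
      }
      HaveToken = true;
    } else if (C == '"') {
      InQuotes = !InQuotes;
      HaveToken = true;
      ++I;
    } else if (!InQuotes && (C == ' ' || C == '\t')) {
      if (HaveToken) {
        Args.push_back(Cur);
        Cur.clear();
        HaveToken = false;
      }
      ++I;
    } else {
      Cur.push_back(C);
      HaveToken = true;
      ++I;
    }
  }
  if (HaveToken)
    Args.push_back(Cur);
  return Args;
}
```

Under these rules the \" in the example pathname is read as an escaped quote, so the quoted section never closes: for the example pathname followed by "-c foo.c", the sketch yields one long token instead of a program name and two arguments.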

(If there are no further " on the command line, it might manage to
return zero arguments, causing an assertion failure later on in
GetCommandLineArguments when it tries to refer to Args[0]. But even if
that doesn't happen, it won't return the *right* arguments.)

To fix this, I've introduced a new variant function
cl::TokenizeWindowsCommandLineFull(), intended to be applied to the
string returned from GetCommandLineW(). It parses the first word of
the command line according to CreateProcess's rules, considering \ to
never be an escaping character; thereafter, it switches over to the C
library rules for the rest of the command line.
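
The two-phase idea can be sketched roughly as follows. This is not the code from the patch: splitFullCommandLine and splitRestCRuntimeStyle are hypothetical names, the C-library phase is simplified (space and tab separators only, no "" special case), and the sketch assumes a program-name token is always emitted, even if it is empty.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Phase 2 (hypothetical helper): simplified C-runtime-style splitting of
// everything from index I onwards. 2n backslashes before " give n
// backslashes and toggle quoting; 2n+1 give n backslashes and a literal ".
static void splitRestCRuntimeStyle(const std::string &Cmd, std::size_t I,
                                   std::vector<std::string> &Args) {
  std::string Cur;
  bool InQuotes = false, HaveToken = false;
  while (I < Cmd.size()) {
    char C = Cmd[I];
    if (C == '\\') {
      std::size_t N = 0;
      while (I < Cmd.size() && Cmd[I] == '\\') { ++N; ++I; }
      bool BeforeQuote = I < Cmd.size() && Cmd[I] == '"';
      Cur.append(BeforeQuote ? N / 2 : N, '\\');
      if (BeforeQuote && N % 2 == 1) {
        Cur.push_back('"'); // escaped quote, does not toggle quoting
        ++I;
      }
      HaveToken = true;
    } else if (C == '"') {
      InQuotes = !InQuotes;
      HaveToken = true;
      ++I;
    } else if (!InQuotes && (C == ' ' || C == '\t')) {
      if (HaveToken) {
        Args.push_back(Cur);
        Cur.clear();
        HaveToken = false;
      }
      ++I;
    } else {
      Cur.push_back(C);
      HaveToken = true;
      ++I;
    }
  }
  if (HaveToken)
    Args.push_back(Cur);
}

// Hypothetical stand-in for the two-phase approach described above.
std::vector<std::string> splitFullCommandLine(const std::string &Cmd) {
  std::vector<std::string> Args;
  std::string Prog;
  bool InQuotes = false;
  std::size_t I = 0;
  // Phase 1: program name. A quote only toggles quoting; a backslash is
  // always an ordinary character. The name ends at unquoted whitespace.
  for (; I < Cmd.size(); ++I) {
    char C = Cmd[I];
    if (C == '"')
      InQuotes = !InQuotes;
    else if (!InQuotes && (C == ' ' || C == '\t'))
      break;
    else
      Prog.push_back(C);
  }
  Args.push_back(Prog); // assumption: always emit a program-name token
  // Phase 2: the remaining arguments follow the C-library rules.
  splitRestCRuntimeStyle(Cmd, I, Args);
  return Args;
}
```

With this split, the example pathname followed by "-c foo.c" produces the program name C:\Program Files\Whatever\clang.exe and the two arguments -c and foo.c, while \" in later arguments still acts as an escaped quote.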

I've also included a final check before dereferencing Args[0], so that
just in case we *still* somehow get a zero-length argument vector,
we'll at least not crash.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122914

Files:
  llvm/include/llvm/Support/CommandLine.h
  llvm/lib/Support/CommandLine.cpp
  llvm/lib/Support/Windows/Process.inc
  llvm/unittests/Support/CommandLineTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D122914.419784.patch
Type: text/x-patch
Size: 8840 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220401/cfb6b68f/attachment.bin>

