[PATCH] D99406: RFC [lit] Detect if processes to execute are MSys based, apply custom quoting logic

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 5 13:26:21 PDT 2021


rnk added inline comments.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:221-226
+    Unfortunately, it seems to be impossible to quote complex arguments in a
+    way that both the MSVC CRT and MSys would interpret the same way
+    consistently. Quoting arguments affect how MSys interpret certain chars
+    like backslashes. One can also set the env var MSYS=noglob to inhibit
+    the globbing process (which affects how chars are quoted), but that
+    breaks a few cases where the globbing handling is required:
----------------
I want to try and find a way to challenge this finding, it's not consistent with what I found when I looked into this years ago, and, really, I just don't want it to be true, for obvious reasons. :)

What cases require msys globbing? I thought the whole reason that we implemented globbing in the lit shell was because regular win32 tools don't implement globbing.

Is MSys glob support something new, something that came after D26009 (2016)?


================
Comment at: llvm/utils/lit/lit/TestRunner.py:232
+    2 a\b\\c\\\\d     a\b\\c\\\\d    a\b\\c\\\\d        a\b\\c\\\\d
+    3 a\"b            a\b            a\b                a"b
+    4 "a[b\c"         a[b\c          a[b\c              a[b\c
----------------
I think this is the case I encountered when I added this custom quoting logic. The original needquote condition is:
https://github.com/python/cpython/blob/master/Lib/subprocess.py#L568
        needquote = (" " in arg) or ("\t" in arg) or not arg

I added a check for `"`, so that in this case, the input will be `"a\"b"`, and both an msys-no-glob and win32 app will see `a"b`.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:235
+    5 "a\b\\c\\\\d"   a\b\c\\d       a\b\\c\\\\d        a\b\\c\\\\d
+    6 "a\"b"          a"b            a\b                a"b
+
----------------
Are you sure the msys-noglob case here is correct? That seems wrong...

I have always believed that the correct, safe way to quote anything on Windows is to:
- surround each argument in double quotes
- backslash escape every `\` and `"` character

Even though this escapes strictly more than is necessary, you usually get the same results. This is for example the algorithm used to produce clang crash reproducer scripts, which is nice because they usually run in both cmd and bash.

Can we file a bug against msys about this? I feel like the app really should see `a"b` in this case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99406/new/

https://reviews.llvm.org/D99406



More information about the llvm-commits mailing list