[llvm-dev] RFC: Support for preferring paths with forward slashes on Windows
Martin Storsjö via llvm-dev
llvm-dev at lists.llvm.org
Thu Oct 14 05:22:17 PDT 2021
Hi,
When using Clang on Windows as a drop-in replacement for GCC, one issue
that crops up fairly soon is that not all callers can tolerate paths
spelled out with backslashes.
This is an issue when e.g. libtool parses the output of "$CC -v" (where
clang passes an absolute path to compiler-rt libraries) and uses parts of
that in shell script contexts that don't tolerate backslashes, or when
callers run "$CC --print-search-dirs", etc.
This is also one of the most important things that MSYS2 patches in their
distribution of Clang/LLVM according to their patch tracker [1].
(I've locally worked around this in my distribution without patching, by
filtering clang's stdout in a wrapper, when options like "-v" or
"--print-search-dirs" are detected, but that's essentially the same as
patching.)
I've finally taken the plunge and tried to implement this properly. I've
got a decent patch set [2] that I could start sending for review, but
before doing that, I'd like to discuss the overall design.
The main idea is that I add a third alternative to path::Style - in
addition to the existing Windows and Posix path styles, I'm adding
Windows_forward, which parses and handles Windows paths just like
before (i.e. accepting and interpreting both separators), but with a
forward slash as the preferred separator (as returned by get_separator()).
This allows any code on any platform to handle paths in all three forms,
just like in the existing design, when explicitly giving a path::Style
argument.
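
To illustrate the idea, here is a minimal standalone C++ sketch. It is not
the code from the patch set [2] and does not use LLVMSupport; the enum and
the helper names is_separator, preferred_separator and make_preferred are
made up for the example. It only shows that the two Windows styles accept
the same separators when parsing and differ in which one they emit.

```cpp
#include <string>
#include <string_view>

// Standalone model of the proposal; hypothetical names, not LLVMSupport.
enum class Style { posix, windows, windows_forward };

// Both Windows flavours parse paths identically.
constexpr bool is_style_windows(Style s) {
  return s == Style::windows || s == Style::windows_forward;
}

// Characters accepted as separators when parsing a path.
constexpr bool is_separator(char c, Style s) {
  if (c == '/')
    return true;
  return c == '\\' && is_style_windows(s);
}

// Separator emitted when building or normalising a path.
constexpr char preferred_separator(Style s) {
  return s == Style::windows ? '\\' : '/';
}

// Rewrite every accepted separator to the preferred one for the style.
std::string make_preferred(std::string_view path, Style s) {
  std::string out(path);
  for (char &c : out)
    if (is_separator(c, s))
      c = preferred_separator(s);
  return out;
}
```

With this model, a mixed input such as C:\foo/bar is accepted by both
Windows styles; only the spelling of the result depends on the style.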
To make this actually take effect, one can make path::Style::native act like
Windows_forward instead of plain Windows. I'm not entirely sure what the
best strategy is for when to do that - one could do it when LLVM itself
was built for a MinGW target (which kind of breaks the assumption that the
tools work pretty much the same as long as one passes the right --target
options etc), or one could maybe set it up as a configure time CMake
option? Or even make it a globally settable option in the process, to
allow changing it e.g. depending on the tool's target configuration?
I also faintly remember that Reid at some point implied that it could be
an option to switch all Windows builds outright to such a behaviour?
Most of the code is entirely independent of the policy decision of
when/where to enable the behaviour - the decision is centralised in a
single spot in LLVMSupport.
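
As a rough sketch of what such a single decision point could look like,
the following standalone C++ resolves a "native" style to a concrete one.
The NativePolicy struct and the real_style function are assumptions made
for illustration, not the names or signatures used in the patch set; where
the policy values come from (a build-time option, the build target, or a
process-wide setting) is exactly the open question above and is left to
the caller here.

```cpp
// Standalone model; hypothetical names, not LLVMSupport.
enum class Style { native, posix, windows, windows_forward };

// Hypothetical policy inputs. How prefer_forward_slash gets its value
// (configure-time option, MinGW build, runtime setting) is undecided.
struct NativePolicy {
  bool host_is_windows;
  bool prefer_forward_slash;
};

// The one place where "native" is mapped to a concrete style.
// Explicitly requested styles are never changed.
constexpr Style real_style(Style s, NativePolicy p) {
  if (s != Style::native)
    return s;
  if (!p.host_is_windows)
    return Style::posix;
  return p.prefer_forward_slash ? Style::windows_forward : Style::windows;
}
```

Code that passes an explicit style keeps working the same on every host;
only callers relying on the native style see the policy.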
In any case, with this design and a fairly moderate number of fixups, most
of the tests in check-all seem to pass when the preference is switched.
A couple of tests fail because they check e.g. the literal paths %s
or %t (as output by llvm-lit, with backslashes) against paths that the
tools output. There are also a dozen or so tests in Clang (mainly
regarding PCH) that seem to misbehave when the same paths are referred to
with varying kinds of slashes, e.g. stored with a forward slash in the PCH
but referred to with backslashes in arguments to Clang, where paths are
essentially equal but the strings differ. (For actual use with PCH, Clang
built this way seems to work - MSYS2 have been running with tools
patched this way for quite some time, and I haven't heard of any bug
reports relating to that patch.)
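
The "equal paths, different strings" problem can be shown with a small
standalone C++ sketch. The function name equal_ignoring_separator_kind is
made up for the example, and this is not a claim about how Clang compares
paths or how the failing tests should be fixed; it only demonstrates that
a plain string comparison fails where a separator-aware one succeeds. It
ignores other sources of difference, such as letter case.

```cpp
#include <algorithm>
#include <string_view>

// Either character separates components in a Windows-style path.
constexpr bool is_windows_separator(char c) {
  return c == '/' || c == '\\';
}

// Compare two Windows-style paths character by character, treating
// '/' and '\\' as interchangeable. Hypothetical helper for illustration.
bool equal_ignoring_separator_kind(std::string_view a, std::string_view b) {
  return std::equal(a.begin(), a.end(), b.begin(), b.end(),
                    [](char x, char y) {
                      return x == y || (is_windows_separator(x) &&
                                        is_windows_separator(y));
                    });
}
```

Here C:\src\a.h and C:/src/a.h compare unequal as strings but equal under
the helper, which matches the mismatch described above.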
If the design seems sane (see [2] for my whole series as it currently
stands), I'd start sending the initial patches for review.
// Martin
[1] https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-clang/README-patches.md
[2] https://github.com/llvm/llvm-project/compare/main...mstorsjo:path-separator