[llvm-dev] RFC: Support for preferring paths with forward slashes on Windows

Martin Storsjö via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 14 05:22:17 PDT 2021


Hi,

When using Clang on Windows as a drop-in replacement for GCC, one issue 
that crops up fairly soon is that not all callers can tolerate paths 
spelled out with backslashes.

This is an issue when e.g. libtool parses the output of "$CC -v" (where 
clang passes an absolute path to compiler-rt libraries) and uses parts of 
that output in shell script contexts that don't tolerate backslashes, or 
when callers run "$CC --print-search-dirs", etc.

This is also one of the most important things that MSYS2 patches in their 
distribution of Clang/LLVM according to their patch tracker [1].

(I've locally worked around this in my distribution without patching, by 
filtering clang's stdout in a wrapper, when options like "-v" or 
"--print-search-dirs" are detected, but that's essentially the same as 
patching.)

I've finally taken the plunge and tried to implement this properly. I've 
got a decent patch set [2] that I could start sending for review, but 
before doing that, I'd like to discuss the overall design.


The main idea is that I add a third alternative to path::Style - in 
addition to the existing Windows and Posix path styles, I'm adding 
Windows_forward, which otherwise parses and handles Windows paths like 
before (i.e. accepting and interpreting both separators), but with a 
different preferred separator (as returned by get_separator()).

This allows any code on any platform to handle paths in all three forms, 
just like in the existing design, when explicitly giving a path::Style 
argument.
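
As a rough illustration of the idea (not the actual patch from [2]), the 
three styles could behave as in the following self-contained sketch. The 
enum and the helper names is_separator, preferred_separator and 
make_preferred are hypothetical stand-ins written for this example, not 
the real LLVMSupport API.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for path::Style with the proposed third variant.
enum class Style { posix, windows, windows_forward };

// Both Windows flavours accept either separator when parsing;
// the posix style only treats '/' as a separator.
constexpr bool is_separator(char c, Style s) {
  if (c == '/')
    return true;
  return c == '\\' && s != Style::posix;
}

// Only the preferred separator differs between the two Windows flavours.
constexpr char preferred_separator(Style s) {
  return s == Style::windows ? '\\' : '/';
}

// Rewrite every separator in `path` to the preferred one for style `s`.
std::string make_preferred(std::string_view path, Style s) {
  std::string out(path);
  for (char &c : out)
    if (is_separator(c, s))
      c = preferred_separator(s);
  return out;
}
```

With these definitions, a mixed spelling such as C:\foo/bar comes out as 
C:/foo/bar under windows_forward and as C:\foo\bar under windows, while 
the posix style leaves backslashes alone because it does not treat them 
as separators.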

To make it actually take effect, one can make path::Style::native act like 
Windows_forward instead of plain Windows. I'm not entirely sure what the 
best strategy is for when to do that - one could do it when LLVM itself 
was built for a MinGW target (which kind of breaks the assumption that the 
tools work pretty much the same as long as one passes the right --target 
options etc), or one could maybe set it up as a configure time CMake 
option? Or even make it a globally settable option in the process, to 
allow changing it e.g. depending on the tool's target configuration?

I also faintly remember that Reid at some point implied that it could be 
an option to switch all Windows builds outright to such a behaviour?

Most of the code is entirely independent of the policy decision of 
when/where to enable the behaviour - the decision is centralised to one 
single spot in LLVMSupport.
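
To illustrate what a single decision point could look like, here is a 
minimal sketch under stated assumptions: the enum is a stand-in for 
path::Style, and both the kPreferForwardSlash constant and the 
host_is_windows parameter are invented for this example. In a real build 
the constant could be driven by a CMake option, the build target, or a 
runtime setting, which is exactly the open policy question.

```cpp
// Hypothetical stand-in for path::Style, including `native`.
enum class Style { native, posix, windows, windows_forward };

// Assumed policy knob (made up for this sketch). Flipping it is the only
// change needed to switch the preferred separator on Windows hosts.
constexpr bool kPreferForwardSlash = true;

// The one place where `native` is mapped to a concrete style.
// Explicitly requested styles are passed through unchanged.
constexpr Style resolve_style(Style s, bool host_is_windows) {
  if (s != Style::native)
    return s;
  if (!host_is_windows)
    return Style::posix;
  return kPreferForwardSlash ? Style::windows_forward : Style::windows;
}
```

Code that passes an explicit style keeps working identically on every 
platform; only callers relying on `native` observe the policy.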

In any case, with this design and a quite moderate amount of fixups, most 
of the tests in check-all seem to pass when the preference is switched.

There are a couple of tests that fail due to checking e.g. the literal 
paths %s or %t (as output by llvm-lit, with backslashes) against paths 
that the tools output. There are also a dozen or so tests in Clang (mainly 
regarding PCH) that seem to misbehave when the same paths are referred to 
with varying kinds of slashes, e.g. stored with a forward slash in the PCH 
but referred to with backslashes in arguments to Clang, where the paths 
are essentially equal but the strings differ. (For actual use with PCH, 
Clang built this way seems to work - and MSYS2 have been running with 
tools patched this way for quite some time, and I haven't heard of any bug 
reports relating to that patch.)
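
The class of mismatch described above can be shown with a small sketch. 
This is only an illustration of why a plain string comparison fails for 
two spellings of the same path; it is not how Clang compares PCH paths, 
and the helper names fold_separator and same_path_spelling are 
hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Treat '\\' and '/' as the same character, as Windows path parsing does.
constexpr char fold_separator(char c) { return c == '\\' ? '/' : c; }

// True if the two spellings differ only in which separator they use.
// Case-insensitivity and "." / ".." components are deliberately ignored.
bool same_path_spelling(std::string_view a, std::string_view b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (fold_separator(a[i]) != fold_separator(b[i]))
      return false;
  return true;
}
```

Here C:\src\foo.pch and C:/src/foo.pch compare unequal as strings but are 
accepted by same_path_spelling, which is the distinction the failing 
tests trip over.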

If the design seems sane (see [2] if you want to look at my whole series 
as it stands at the moment), I'd start sending the initial patches for 
review.

// Martin

[1] https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-clang/README-patches.md

[2] https://github.com/llvm/llvm-project/compare/main...mstorsjo:path-separator

