[PATCH] D111879: [Support] Add a new path style for Windows with forward slashes

Aaron Ballman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 2 10:17:06 PDT 2021


aaron.ballman added a comment.

In D111879#3102724 <https://reviews.llvm.org/D111879#3102724>, @mstorsjo wrote:

> In D111879#3102636 <https://reviews.llvm.org/D111879#3102636>, @aaron.ballman wrote:
>
>> Sure can! I need to think about this for a bit though -- I'm not super comfortable initially, but I can become comfortable. Windows silently accepts forward slashes as path separators and has for a long time, but there are circumstances where the forward slashes are NOT supported too. Those cases should hopefully be rare, but it'd be good to know how we treat them. For example, Win32 file namespace path names do not undergo separator normalization (e.g., paths that start with `\\?\` so that they work with path components larger than max path, etc, such as `\\?\C:\Users\aballman\Desktop\test.c`). I think I'd prefer *not* supporting mixed path separators with that style of path because the system does not support it.
>
> FWIW, as `llvm::sys::path` currently already supports paths with either slash form (possibly mixed) - this keeps that behaviour, but adds another mode of operation, where the preferred separator is a forward slash (inserted whenever constructing paths, or when normalizing paths to that style). This particular patch in itself is entirely platform independent as it only adds the different mode for constucting paths (when one can choose to operate on paths in any mode on any platform).
>
> Admittedly, while one could have paths with mixed or forward slashes before, it probably was rather rare in practice, while this patch makes it possible to have it occur much more frequently.
>
> The follow-up, D111880 <https://reviews.llvm.org/D111880>, shows how the added facilities can be used to get an environment with mostly coherent forward slashes.
>
> Most of the handling of paths (inspecting paths, appending, etc) doesn't care about the individual separator form at all, but when we actually want to operate on files, we mostly pass through the `widenPath` function in llvm/lib/Support/Windows/Path.inc. That function already takes a UTF8 pathname (with possibly mixed separators) and converts it to UTF16 for use with the windows -W suffixed APIs, and possibly prepends a `\\?\` prefix if needed for long names.
>
> So unless we have something trying to use pathnames that bypass the `widenPath` function (which is needed to map UTF8 to the actual filesystem charset anyway), `\\?\` shouldn't be an issue. (However I realized one case in D111880 <https://reviews.llvm.org/D111880> that I should fix, unrelated to this patch.)

Fantastic, thank you for the explanation, that's very handy!



================
Comment at: llvm/lib/Support/Path.cpp:608
 StringRef get_separator(Style style) {
-  if (is_style_windows(style))
+  if (real_style(style) == Style::windows)
     return "\\";
----------------
Up on line 48, we use `is_style_windows()` instead of calling `real_style()`, but here we've flipped the edits. Is there a reason for the distinction?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111879/new/

https://reviews.llvm.org/D111879



More information about the llvm-commits mailing list