[PATCH] D87732: [Support] Provide sys::path::guess_style

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 17 01:10:54 PDT 2020


jhenderson added a comment.

In D87732#2278196 <https://reviews.llvm.org/D87732#2278196>, @phosek wrote:

> In D87732#2275943 <https://reviews.llvm.org/D87732#2275943>, @jhenderson wrote:
>
>> I'm not convinced this is really correct. After all, a path with mixed separators (e.g. '/my/path\to\foo') could be a Windows path rooted at the current drive. This mixed style can sometimes happen when things get concatenated together, so I don't think it's compeletely unreasonable.
>>
>> I'm also concerned that if we guess wrong, the behaviour will end up breaking. For example, if the path were a Windows path "/my/path\../to/foo", I'd expect a remove_dots call (without worrying about slash normalisation) to result in "/my/to/foo", but if the code is using Posix style, will it actually result in the path being left unchanged?
>>
>> I can think of two competing alternatives:
>>
>> 1. add an option to `remove_dots` to not do the separator canonicalization. I'm not sure whether this can be done unambiguously however - which separator should be removed when a directory is removed due to dots? The one before the removed parts? After them?
>> 2. change all '\' to '/' unconditionally. However, this might break Linux paths with '\' in.
>
> None of these solutions is perfect and I'm not even sure if there's one. Even if we improve `remove_dots`, there's still the existing issue with `sys::path::append`: which separator should we use when combining compilation dir, include dir and the file name.
>
> Maybe we should instead update Clang to do the normalization when generating debug info and always use `/`, and then always use `/` unconditionally when combining paths on the consumer side (but don't do any normalization)?

You're probably right about no solution being perfect. I'm not a fan of using '/' on Windows, mostly because it looks weird, but also because at least in the past for Windows cmd, '/' paths didn't auto-complete properly. It's only a minor gripe though (and not really an issue nowadays since I mostly use PowerShell). Normalisation to at least remove dots would make a lot of sense (strings become shorter), and possibly even normalising the separator direction does too, since that would make it more likely paths end up being identical and therefore poolable in .debug_str/.debug_line_str.

`sys::path::append` could probably unconditionally be Posix, since such paths will also be valid Windows paths always. Maybe we need some way of communicating target platform as an alternative to `native` style? If that were a command-line option, the caller at least would know which style it needs to work with.

I'd like to hear others opinions - @dblaikie?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87732/new/

https://reviews.llvm.org/D87732



More information about the llvm-commits mailing list