[PATCH] D87667: [Support/Path] Add path::is_gnu_style_absolute

Vinicius Tinti via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 15 18:52:43 PDT 2020


tinti added a comment.





================
Comment at: llvm/include/llvm/Support/Path.h:465-468
+/// GNU style absolute define that:
+/// 1) Paths starting with '/' are always absolute;
+/// 2) On Windows, paths starting with '\\' are absolute;
+/// 3) On Windows, paths starting with drive letter pattern are absolute.
----------------
jhenderson wrote:
> 
Ok.


================
Comment at: llvm/include/llvm/Support/Path.h:470
+///
+/// @param path Input path.
+/// @result True if the path is GNU style absolute, false if it is not.
----------------
jhenderson wrote:
> You need to explain the purpose of the `style` parameter too. I don't know why this is missing from the other doc comments around here.
The `style` will allow one to behave like Linux running o Windows or vice-versa.

Yes. They are missing indeed. That is why I didn't even thought in putting them.


================
Comment at: llvm/lib/Support/Path.cpp:690
 
+bool is_gnu_style_absolute(const Twine &Path, Style Style) {
+  SmallString<128> PathStorage;
----------------
jhenderson wrote:
> I'd put this function adjacent to `is_absolute`, like it is in the header.
> 
> Also, I think you can use the same naming style for variables as in `is_absolute` (i.e. lowerCameCase).
> 
> Rather than implement this whole function from scratch, maybe you could just do the bit that's different from `is_absolute` first, and then just call `is_absolute`?
> I'd put this function adjacent to `is_absolute`, like it is in the header.

Ok.

> Also, I think you can use the same naming style for variables as in `is_absolute` (i.e. lowerCameCase).

Ok. Sorry clang-tidy.

> Rather than implement this whole function from scratch, maybe you could just do the bit that's different from `is_absolute` first, and then just call `is_absolute`?

I tried. And I intentionally avoided it. My original implementation called `is_absolute` at the end.
Now I don't remember if it fails in one of my tests (which I removed because I tested with the data from `TEST(Support, Path)`) or if it is not needed.

My argument is that It needs to be as readable as possible. I tried to red `is_absolute` and may:

-- call has_root_directory()
---- call root_directory()
------ create an iterator()
------ call is_separator()
-- call has_root_name()
---- call root_name()
------ create an iterator()
------ call is_separator() 

I think is too much complexity to one thing that must be false. After reading the code from GNU it is more clear. The `#ifdef` only sets or unsets `dos_based`.

See below:

```
#define IS_DIR_SEPARATOR_1(dos_based, c)				\
  (((c) == '/')								\
   || (((c) == '\\') && (dos_based)))

#define HAS_DRIVE_SPEC_1(dos_based, f)			\
  ((f)[0] && ((f)[1] == ':') && (dos_based))

#define IS_ABSOLUTE_PATH_1(dos_based, f)		 \
  (IS_DIR_SEPARATOR_1 (dos_based, (f)[0])		 \
   || HAS_DRIVE_SPEC_1 (dos_based, f))
```

Comparing with my rules:

1) Paths starting with '/';

```
#define IS_DIR_SEPARATOR_1(dos_based, c)				\
  (((c) == '/')								\
   || (((c) == '\\') && (dos_based)))
```

2) On Windows, paths starting with '\\';

```
#define IS_DIR_SEPARATOR_1(dos_based, c)				\
  (((c) == '/')								\
   || (((c) == '\\') && (dos_based)))
```

3) On Windows, paths starting with a drive letter pattern.

```
#define HAS_DRIVE_SPEC_1(dos_based, f)			\
  ((f)[0] && ((f)[1] == ':') && (dos_based))

#define IS_ABSOLUTE_PATH_1(dos_based, f)		 \
  (IS_DIR_SEPARATOR_1 (dos_based, (f)[0])		 \
   || HAS_DRIVE_SPEC_1 (dos_based, f))
```

Any other condition must be false. I don't thin is worth to call `is_absolute`.


================
Comment at: llvm/lib/Support/Path.cpp:695
+  // Handle '/' which should be true regardless Windows or POSIX.
+  if (!P.empty() && is_separator(P.front(), Style::posix))
+    return true;
----------------
I wanted to even avoid calling `is_separator`.

What do you think?


================
Comment at: llvm/lib/Support/Path.cpp:700
+    // Handle '\\' on Windows.
+    if (!P.empty() && is_separator(P.front(), Style::windows))
+      return true;
----------------
`/` has already been tested. There is no need to repeat here.


================
Comment at: llvm/lib/Support/Path.cpp:704
+    // Handle drive letter pattern (i.e "c:") on Windows.
+    if (P.size() >= 2 && (P[0] && P[1] == ':'))
+      return true;
----------------
rengolin wrote:
> Neither `7:` nor `$:` are valid paths on Windows, I think.
> 
> Perhaps `std::isalpha(P[0])` would help.
> Neither `7:` nor `$:` are valid paths on Windows, I think.
> 
> Perhaps `std::isalpha(P[0])` would help.

I thought it too but the code on GNU does not check `isalpha`.

```
#define HAS_DRIVE_SPEC_1(dos_based, f)			\
  ((f)[0] && ((f)[1] == ':') && (dos_based))
```

The Wikipedia also points an extension [1].



```
If access to more filesystems than Z: is required under Windows NT, Volume Mount Points must be used.[11] However, it is possible to mount non-letter drives, such as 1:, 2:, or !: using the command line SUBST utility in Windows XP or later (i.e. SUBST 1: C:\TEMP), but it is not officially supported and may break programs that assume that all drives are letters A: to Z:.
```

There is code on LLDB that does the way you suggested [2].

[1] https://en.wikipedia.org/wiki/Drive_letter_assignment#Common_assignments
[2] https://github.com/llvm/llvm-project/blob/bc9b6b33a0d5760370e72ae06c520c25aa8d61c6/lldb/source/Utility/FileSpec.cpp#L315


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87667/new/

https://reviews.llvm.org/D87667



More information about the llvm-commits mailing list