[libcxx-commits] [PATCH] D91176: [20/N] [libcxx] Implement parsing of root_name for paths on windows

Marek Kurdej via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Fri Jan 8 13:44:10 PST 2021


curdeius added inline comments.


================
Comment at: libcxx/src/filesystem/operations.cpp:202-205
+        PosPtr TkStart = consumeRootName(SepEnd, REnd);
+        if (TkStart == REnd)
+          return makeState(PS_InRootDir, RStart, RStart + 1);
         return makeState(PS_InTrailingSep, SepEnd + 1, RStart + 1);
----------------
mstorsjo wrote:
> curdeius wrote:
> > This looks strange. IIUC it means that you consumed a separator, and then, if `TkStart != nulltptr`, you consumed drive (e.g. "C:").
> > Can it happen anywhere? `/C:`?
> > 
> > Apart from that, TkStart returned by `consumeRootName` isn't used after the if (in `makeState`).
> > Shouldn't it be used instead of `SepEnd + 1` (if non null)?
> This is for the decrement function, which iterates backwards. So when parsing "c:/" backwards, this first consumes the separator. If we hit the end (which would happen for a path like "/" but not for "c:/"), it's considered a root dir. If it's not the last token, but followed by (parsing-wise, i.e. backwards) a root name, and that root name is at the start of the string, then the separator also is a root dir.
> 
> When parsing forward, the use of `consumeRootName` is easy as we only need to try to call it at the very beginning. When parsing backwards, we need to speculatively test it at a few places, in order to know how to classify a previous separator.
> 
> So here, the root name parsed by `consumeRootName` isn't actually consumed, i.e. we don't really use the `TkStart` it returns for returning parts of the string to the caller, but only for classifying the separator.
I have indeed forgotten about iterating backwards. Thank you for the explanation.


================
Comment at: libcxx/src/filesystem/operations.cpp:347-359
+  PosPtr consumeSeparators(PosPtr P, PosPtr End, int N) const noexcept {
+    PosPtr Ret = consumeSeparator(P, End);
+    if (Ret == nullptr)
+      return nullptr;
+    if (P < End) {
+      if (Ret == P + N)
+        return Ret;
----------------
mstorsjo wrote:
> curdeius wrote:
> > This function, when called with N==2 will consume e.g. `//`, `\\`, but also `/x`, and `\x`.
> > Is it intended?
> > I was rather expecting a loop calling `consumeSeparator` up to `N` times.
> The existing function `consumeSeparator` actually is a bit misnamed, it consumes one or more separators. This function checks that it consumed exactly `N` separators, otherwise we return null and don't consume anything.
> 
> Should this maybe be called `consumeExactlyNSeparators` for clarity? (That name is a bit clumsy though.)
Or just `consumeNSeparators`.
And then rename `consumeSeparator` to sth like `consumeAllSeparators`.
Maybe?


================
Comment at: libcxx/src/filesystem/operations.cpp:363
+    PosPtr Start = P;
+    if (P == nullptr || P == End || isSeparator(*P))
       return nullptr;
----------------
mstorsjo wrote:
> curdeius wrote:
> > I'm not a big fan of allowing P being nullptr.
> This (and the other case) was added for making `consumeNetworkRoot` more elegant. I could also spell it out like this:
> ```
>     if (P < End) {
>       P = consumeSeparators(P, End, 2);
>       if (P == nullptr)
>         return nullptr;
>       return consumeName(P, End);
>     } else {
>       P = consumeName(P, End);
>       if (P == nullptr)
>         return nullptr;
>       return consumeSeparators(P, End, 2);
>     }
> 
> ```
I see. That ain't pretty either.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91176/new/

https://reviews.llvm.org/D91176



More information about the libcxx-commits mailing list