[lldb-dev] FileSpec and normalization questions

Fri Apr 20 01:08:26 PDT 2018

On Thu, 19 Apr 2018 at 19:20, Zachary Turner via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

> On Thu, Apr 19, 2018 at 11:14 AM Greg Clayton via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:

>> Also, looking at the tests for normalizing paths I found the following
>> pairs of pre-normalized and post-normalization paths for posix:

>>        {"//", "//"},
>>        {"//net", "//net"},

>> Why wouldn't we reduce "//" to just "/" for posix? And why wouldn't we
>> reduce "//net" to "/net"?

> I don't know what the author of this test had in mind, but from the POSIX
> spec:

> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11

> > A pathname that begins with two successive slashes may be interpreted
> > in an implementation-defined manner, although more than two leading slashes
> > shall be treated as a single slash.

Yes, that's exactly what the author of this test (me) had in mind. :)
And it's not just a hypothetical POSIX thing either. Windows and Cygwin
both use \\ and // to mean funny things. I remember also seeing something
like that on Linux, though I can't remember now what it was being used for.

This is also how the llvm path functions handle these prefixes, so I think
we should keep them. I don't know whether we do this already, but we can
obviously fold three or more consecutive leading slashes into one during
normalization. The same goes for two or more consecutive slashes that are
not at the beginning of the path.
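
To make the rule concrete, here is a minimal sketch of the slash folding
described above. FoldSlashes is a hypothetical helper written only for this
illustration; it is not the LLDB or LLVM code, and it deliberately ignores
everything else normalization does (such as "." and ".." components).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not the LLDB or LLVM implementation.
// Collapses runs of '/' into a single '/', except that a path beginning
// with exactly two slashes keeps its two-slash prefix.
std::string FoldSlashes(const std::string &path) {
  // Count the slashes at the start of the path.
  std::size_t lead = 0;
  while (lead < path.size() && path[lead] == '/')
    ++lead;

  std::string result;
  // Exactly two leading slashes are implementation-defined, so keep them.
  // One slash, or three or more, becomes a single slash.
  if (lead == 2)
    result = "//";
  else if (lead > 0)
    result = "/";

  // Copy the remainder, dropping any slash that directly follows a slash.
  for (std::size_t i = lead; i < path.size(); ++i) {
    if (path[i] == '/' && !result.empty() && result.back() == '/')
      continue;
    result += path[i];
  }
  return result;
}
```

With this sketch, "//" and "//net" come back unchanged, which is what the
test pairs quoted earlier expect, while "///net" becomes "/net" and
"/foo//bar" becomes "/foo/bar".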

On Thu, Apr 19, 2018 at 11:14 AM Greg Clayton via lldb-dev <
lldb-dev at lists.llvm.org> wrote:
>>     {"./foo", "foo"},
> Do we prefer to not have "./foo" to stay as "./foo"?

This is an interesting question. It basically comes down to our definition
of "identical" FileSpecs. Do we consider "foo" and "./foo" to be identical?
If we do, then we should do the above normalization (theoretically we could
choose a different normal form, and convert "foo" to "./foo", but I think
that would be even weirder); otherwise we should skip it.

On one hand, these are obviously identical -- if you just take the string
and pass it to the filesystem, you will always get back the same file. But,
on the other hand, we have this notion that a FileSpec with an empty
directory component represents a wildcard that matches any file with that
name in any directory. For these purposes "./foo" and "foo" are two very
different things.
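
The difference can be shown with a small sketch. Spec, SplitPath and
Matches are hypothetical names invented for this example and are not
LLDB's FileSpec API; the only assumption carried over from the text is that
an empty directory component acts as a wildcard.

```cpp
#include <string>

// Hypothetical stand-in for a file specification; not LLDB's FileSpec.
struct Spec {
  std::string directory; // empty means "any directory" in this sketch
  std::string filename;
};

// Splits a path at its last '/' without performing any normalization.
Spec SplitPath(const std::string &path) {
  const std::string::size_type pos = path.rfind('/');
  if (pos == std::string::npos)
    return Spec{"", path};
  // Keep "/" as the directory for files directly under the root, so that
  // "/foo" is not mistaken for a wildcard.
  return Spec{path.substr(0, pos == 0 ? 1 : pos), path.substr(pos + 1)};
}

// Returns true if `file` satisfies `pattern`. A pattern with an empty
// directory matches a file of that name in any directory.
bool Matches(const Spec &pattern, const Spec &file) {
  if (pattern.filename != file.filename)
    return false;
  return pattern.directory.empty() || pattern.directory == file.directory;
}
```

Under these assumptions, SplitPath("foo") matches "/usr/src/foo", but
SplitPath("./foo") does not, because its directory component is "." and
not empty. Normalizing "./foo" to "foo" would silently turn the second
pattern into the first.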

So, I can see the case for both, and I don't really have a clear
preference. All I would say is, whichever way we choose, we should make it
very explicit so that the users of FileSpec know what to expect.

On Thu, 19 Apr 2018 at 19:37, Zachary Turner via lldb-dev <
lldb-dev at lists.llvm.org> wrote:
> I think I might have tried to replace some of the low level functions in
> FileSpec with the LLVM equivalents and gotten a few test failures, but I
> didn't have time to investigate.  It would be a worthwhile experiment for
> someone to try again if they have some cycles.

I can try to take a look at it. The way I remember it, I just copied these
functions from llvm and replaced all #ifdefs with runtime checks, which is
pretty much what you later did in llvm proper. Unless there has been some
significant divergence since then, it shouldn't be hard to reconcile these.