[lldb-dev] FileSpec and normalization questions

Greg Clayton via lldb-dev lldb-dev at lists.llvm.org
Fri Apr 20 09:14:38 PDT 2018



> On Apr 20, 2018, at 1:08 AM, Pavel Labath <labath at google.com> wrote:
> 
> On Thu, 19 Apr 2018 at 19:20, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> 
> 
> 
>> On Thu, Apr 19, 2018 at 11:14 AM Greg Clayton via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> 
>>> Also, looking at the tests for normalizing paths, I found the following
>>> pairs of pre-normalization and post-normalization paths for posix:
> 
>>>       {"//", "//"},
>>>       {"//net", "//net"},
> 
>>> Why wouldn't we reduce "//" to just "/" for posix? And why wouldn't we
>>> reduce "//net" to "/net"?
> 
> 
>> I don't know what the author of this test had in mind, but from the POSIX
>> spec:
> 
> 
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11
> 
>>> A pathname that begins with two successive slashes may be interpreted
> in an implementation-defined manner, although more than two leading slashes
> shall be treated as a single slash.
> 
> 
> Yes, that's exactly what the author of this test (me) had in mind. :)
> And it's not just a hypothetical posix thing either. Windows and cygwin
> both use \\ and // to mean funny things. I remember also seeing something
> like that on linux, though I can't remember now what it was being used for.

OK, so normalization needs to preserve a leading // or \\ on any path that starts with one.

> 
> This is also the same way as llvm path functions handle these prefixes, so
> I think we should keep them. I don't know whether we do this already, but
> we can obviously fold 3 or more consecutive slashes into one during
> normalization. Same goes for two slashes which are not at the beginning of
> the path.
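
To make the folding rule above concrete, here is a minimal sketch for POSIX-style paths only. The function name foldSlashes is hypothetical and this is not the FileSpec or LLVM code; it only illustrates the rule as described: exactly two leading slashes are kept, three or more leading slashes become one, and any run of slashes later in the path becomes one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not LLDB's FileSpec code: fold runs of '/' in a
// POSIX-style path while preserving a leading run of exactly two slashes.
std::string foldSlashes(const std::string &path) {
  std::string out;
  out.reserve(path.size());

  // Count the leading slashes.
  std::size_t lead = 0;
  while (lead < path.size() && path[lead] == '/')
    ++lead;

  // Exactly two leading slashes are implementation-defined, so keep them.
  // One slash, or three and more, become a single slash.
  if (lead == 2)
    out = "//";
  else if (lead > 0)
    out = "/";

  // Fold every later run of slashes into one.
  bool prevSlash = false;
  for (std::size_t i = lead; i < path.size(); ++i) {
    const char c = path[i];
    if (c == '/') {
      if (!prevSlash)
        out.push_back(c);
      prevSlash = true;
    } else {
      out.push_back(c);
      prevSlash = false;
    }
  }
  return out;
}

// Small usage example covering the test pairs quoted earlier in the thread.
inline void foldSlashesExamples() {
  assert(foldSlashes("//") == "//");
  assert(foldSlashes("//net") == "//net");
  assert(foldSlashes("///net") == "/net");
}
```

The sketch leaves a trailing slash and any "." or ".." components alone; those are separate normalization questions.
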
> 
> 
> On Thu, Apr 19, 2018 at 11:14 AM Greg Clayton via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>>>    {"./foo", "foo"},
>> Do we prefer that "./foo" not stay as "./foo"?
> 
> This is an interesting question. It basically comes down to our definition
> of "identical" FileSpecs. Do we consider "foo" and "./foo" to be identical? If
> we do, then we should do the above normalization (theoretically we could
> choose a different normal form, and convert "foo" to "./foo", but I think
> that would be even weirder), otherwise we should skip it.
> 
> On one hand, these are obviously identical -- if you just take the string
> and pass it to the filesystem, you will always get back the same file. But,
> on the other hand, we have this notion that a FileSpec with an empty
> directory component represents a wildcard that matches any file with that
> name in any directory. For these purposes "./foo" and "foo" are two very
> different things.

> 
> So, I can see the case for both, and I don't really have a clear
> preference. All I would say is, whichever way we choose, we should make it
> very explicit so that the users of FileSpec know what to expect.

I would say that without a directory, a FileSpec is a wildcard match on the base name alone. With a directory, the partial directory must match if the path is relative, and the full directory must match if the path is absolute. I will submit a patch that keeps leading "./" and "../" during normalization, and we will see what people think.
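
A minimal sketch of those matching rules, under some assumptions: PathSpec, endsWithComponents and matches are hypothetical names and not the FileSpec API, paths are POSIX-style, and "the partial directory must match" is read here as "the candidate's directory ends with the pattern's directory on a component boundary". Leading "./" and "../" are not interpreted at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a FileSpec: a directory part and a base name.
struct PathSpec {
  std::string directory; // empty means "no directory given"
  std::string filename;
};

// True if `dir` ends with `suffix` and the match starts on a '/' boundary,
// so "src" matches "/usr/src" but "rc" does not.
bool endsWithComponents(const std::string &dir, const std::string &suffix) {
  if (suffix.size() > dir.size())
    return false;
  const std::size_t start = dir.size() - suffix.size();
  if (dir.compare(start, suffix.size(), suffix) != 0)
    return false;
  return start == 0 || dir[start - 1] == '/';
}

// Assumed matching rules:
//  - no directory in the pattern: wildcard, base name only;
//  - absolute directory: the whole directory must be equal;
//  - relative directory: must match the tail of the candidate directory.
bool matches(const PathSpec &pattern, const PathSpec &candidate) {
  if (pattern.filename != candidate.filename)
    return false;
  if (pattern.directory.empty())
    return true;
  if (pattern.directory.front() == '/')
    return pattern.directory == candidate.directory;
  return endsWithComponents(candidate.directory, pattern.directory);
}

// Small usage example.
inline void matchesExamples() {
  const PathSpec file{"/usr/src", "foo.c"};
  assert(matches({"", "foo.c"}, file));     // wildcard on base name
  assert(matches({"src", "foo.c"}, file));  // relative, partial match
  assert(!matches({"/usr", "foo.c"}, file)); // absolute, must be equal
}
```
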

> 
> On Thu, 19 Apr 2018 at 19:37, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>> I think I might have tried to replace some of the low level functions in
>> FileSpec with the LLVM equivalents and gotten a few test failures, but I
>> didn't have time to investigate.  It would be a worthwhile experiment for
>> someone to try again if they have some cycles.

I took a look at the LLVM file system code. It has llvm::sys::fs::real_path, which always resolves symlinks _and_ normalizes the path. It would be nice to split that into two parts by adding llvm::sys::fs::normalize_path and having llvm::sys::fs::real_path call it.
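
To illustrate the kind of split meant here, this is a sketch using C++17 std::filesystem rather than the LLVM functions. The names lexicalNormalize and resolveSymlinks are hypothetical, and llvm::sys::fs::normalize_path does not exist at the time of writing; the point is only that lexical normalization can be done without touching the file system, while symlink resolution cannot.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Purely lexical step: collapses "." and ".." and repeated separators
// by looking at the string only. It never queries the file system.
fs::path lexicalNormalize(const fs::path &p) { return p.lexically_normal(); }

// File-system step: resolves symlinks, which requires the path to exist.
// On failure, `ec` is set and an empty path is returned.
fs::path resolveSymlinks(const fs::path &p, std::error_code &ec) {
  fs::path resolved = fs::canonical(p, ec);
  if (ec)
    return fs::path();
  return resolved;
}

// Small usage example.
inline void normalizeExamples() {
  assert(lexicalNormalize("/a/./b/../c") == fs::path("/a/c"));
  std::error_code ec;
  const fs::path cwd = resolveSymlinks(".", ec);
  assert(!ec && cwd.is_absolute());
}
```

Two caveats if this were used as a model. lexically_normal() turns "./foo" into "foo", so it would not keep a leading "./". And collapsing "b/.." lexically can name a different file than the resolved path when "b" is a symlink, which is one reason to keep the two steps separate.
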

> I can try to take a look at it. The way I remember it, I just copied these
> functions from llvm and replaced all #ifdefs with runtime checks, which is
> pretty much what you later did in llvm proper. Unless there has been some
> significant divergence since then, it shouldn't be hard to reconcile these.

Ok, I will submit a patch and we will see how things go.


Greg


