[Lldb-commits] [PATCH] D45977: Always normalize FileSpec paths.

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Tue Apr 24 09:22:34 PDT 2018


labath added inline comments.


================
Comment at: source/Utility/FileSpec.cpp:406
+      m_directory.Clear();
+  }
 }
----------------
clayborg wrote:
> labath wrote:
> > clayborg wrote:
> > > > they resolve to the same file in a virtual filesystem, where all referenced path components exist and are directories
> > > 
> > > Normalizing happens after resolving and if we don't resolve a path, we have no way to know what is a directory and what isn't. We will be setting breakpoints for remote targets quite a bit in LLDB and ww can't assume or stat anything in a path. So I would say FileSpec's are equivalent if the relative paths match all components.
> > > 
> > > > Now the (c) part would imply that foo/.. should normalize to ./. and not ., which is a bit odd, but it is consistent with our stated intention of preserving directory information. If we do not want to have ./. here, then we need to come up with a different definition of what it means to be "normalized".
> > > 
> > > I guess we could make the  m_directory contain "." and m_filename contain nothing for the "./." case. It doesn 't make  sense to have "." in both places.
> > >> they resolve to the same file in a virtual filesystem, where all referenced path components exist and are directories
> > 
> > > Normalizing happens after resolving and if we don't resolve a path, we have no way to know what is a directory and what isn't. We will be setting breakpoints for remote targets quite a bit in LLDB and ww can't assume or stat anything in a path.
> > 
> > Yes, I am aware of that. I am not saying this is how we should actually implement the normalization algorithm. I am trying define what a "normalization" is in the first place, so that we can then judge whether a particular normalization algorithm is good or not. I think defining normalization in terms of an actual filesystem makes sense, since at the end of the day, our algorithm should somehow approximate what happens in real file systems. I am not saying the algorithm should be doing any stats, but for the verification (either in our heads or in the tests) we can use certainly use stats or actual file systems.
> > 
> > > So I would say FileSpec's are equivalent if the relative paths match all components.
> > 
> > This is too vague to be useful. I have no idea how I would apply this definition to determine if e.g. "/foo/../bar.txt" and "./bar.txt" are equivalent. And you didn't say anything about how to derive the normal form for a `FileSpec`.
> > 
> > >> Now the (c) part would imply that foo/.. should normalize to ./. and not ., which is a bit odd, but it is consistent with our stated intention of preserving directory information. If we do not want to have ./. here, then we need to come up with a different definition of what it means to be "normalized".
> > 
> > > I guess we could make the m_directory contain "." and m_filename contain nothing for the "./." case. It doesn 't make sense to have "." in both places.
> > 
> > I don't think that is very useful, as then this would be the only special case where normalization would produce a FileSpec without a filename component.
> >> So I would say FileSpec's are equivalent if the relative paths match all components.
> 
> > This is too vague to be useful. I have no idea how I would apply this definition to determine if e.g. "/foo/../bar.txt" and "./bar.txt" are equivalent. And you didn't say anything about how to derive the normal form for a FileSpec.
> 
> "/foo/../bar.txt" would be normalized to "/bar.txt" and "./bar.txt" will, after switching to llvm's remove_dots, be normalized to "bar.txt". So those could be though of as equivalent since one is only a basename and would only need to match the filename. If you have "/foo/bar/baz.txt", it could be equivalent to "bar/baz.txt" by making sure the filename's match, and if either or both path is relative, then matching as many directories as are specified.
> 
> 
> >> I guess we could make the m_directory contain "." and m_filename contain nothing for the "./." case. It doesn 't make sense to have "." in both places.
> 
> > I don't think that is very useful, as then this would be the only special case where normalization would produce a FileSpec without a filename component.
> 
> Ok, then leave it as is with "." in filename, not directory. We don't need it in both places IMHO
Ok, if we start using llvm's `remove_dots` as our normalization algorithm, then both of these issues will become moot (and I believe I have a fairly good understanding of how `remove_dots` works).


================
Comment at: source/Utility/FileSpec.cpp:857
+        !IsPathSeparator(components[i].back(), syntax) &&
+        (result.empty() || !IsPathSeparator(result.back(), syntax)))
       result += GetPreferredPathSeparator(syntax);
----------------
clayborg wrote:
> labath wrote:
> > clayborg wrote:
> > > labath wrote:
> > > > Why is this necessary? `result` can't be empty because we have just appended something to it (and we've checked that `components[i]` is non-empty).
> > > Because if you don't check this joining {"/', "foo.txt"} will result in "//foo.txt" which is wrong,.
> > Yes, but didn't the original condition guard against that already?
> > 
> > We know that `components[i]` is non-empty, and we have just appended it to `result` two lines above. So, unless I am missing something, `result.back()` should be the same as `components[i].back()` and this additional check does not buy us anything.
> I did not. During testing I found this case and "/" + "main.c" was producing "//main.c"
I've just tried calling `join_path_components(ePathSyntaxPosix, {"/", "main.c"})`, and it produced `/main.c` even without your modifications.


https://reviews.llvm.org/D45977





More information about the lldb-commits mailing list