[PATCH] D51159: [FileManager] Do not call 'real_path' in getFile().

Simon Marchi via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 23 13:13:25 PDT 2018


simark added inline comments.


================
Comment at: lib/Basic/FileManager.cpp:319
 
-  SmallString<128> RealPathName;
-  if (!FS->getRealPath(InterndFileName, RealPathName))
-    UFE.RealPathName = RealPathName.str();
+  if (UFE.File) {
+    if (auto Path = UFE.File->getName()) {
----------------
ioeric wrote:
> simark wrote:
> > ioeric wrote:
> > > simark wrote:
> > > > ioeric wrote:
> > > > > simark wrote:
> > > > > > What's the rationale for only computing the field if `UFE.File` is non-null?
> > > > > > 
> > > > > > Previously, if you looked up the file with `openFile == false` and then later `openFile == true`, the `RealPathName` field would not be set because of this.  That doesn't seem right.
> > > > > There has been no guarantee that RealFilePath is always set. I think that's the reason why the acceasor is called tryGetRealPathName.
> > > > The way I understood it was that it could be empty because computing the real path can fail.  Not just because we didn't skipped computing it.
> > > I agree that the API is confusing and lack of documentation (and we should fix), but the previous implementation did skip the computation if File is not available though. 
> > > I agree that the API is confusing and lack of documentation (and we should fix), but the previous implementation did skip the computation if File is not available though.
> > 
> > Did it have a reason not to?  What is the `RealPathName` field useful for, if it's unreliable?
> I think the biggest concern is the performance.
>  
> For example, clangd code completion latency increased dramatically with `real_path`:
> With `real_path`:
> {F7039629}
> {F7041680}
> 
> Wihtou `real_path`:
> {F7039630}
But with this patch, we are not using real_path anymore, so it can't be the reason for not computing `RealPathName` when not opening the file.  Since the non-real_path method is much more lightweight, would it make a significant difference if we always computed the field?

In the end I don't really mind, I am just looking for a rational explanation to why it's done this way.


================
Comment at: lib/Basic/FileManager.cpp:326
+      llvm::sys::path::remove_dots(AbsPath, /*remove_dot_dot=*/true);
+      UFE.RealPathName = AbsPath.str();
+    }
----------------
ioeric wrote:
> simark wrote:
> > ioeric wrote:
> > > simark wrote:
> > > > ioeric wrote:
> > > > > simark wrote:
> > > > > > If the path contains symlinks, doesn't this put a non-real path in the RealPathName field?  Won't users (e.g. clangd) use this value thinking it is a real path, when it is actually not?
> > > > > This was the original behavior. In general, File Manager should never call real_path for users because it can be very expensive. Users should call real_path if they want to resolve symlinks. That said, it's fair to say that "RealPathName" is just a wrong name, and we should clean it up at some point.
> > > > Ok, then if the goal is not to actually have a real path (in the realpath(3) sense), that's fine.  But I think we should rename the field sooner than later, it's really confusing.
> > > > 
> > > > That also means that it's kind of useless for us in clangd, so we should always call real_path there and not rely on that field.
> > > > Ok, then if the goal is not to actually have a real path (in the realpath(3) sense), that's fine. But I think we should rename the field sooner than later, it's really confusing.
> > > +1
> > > 
> > > > That also means that it's kind of useless for us in clangd, so we should always call real_path there and not rely on that field.
> > > I guess it depends on whether you want symlink resolution or not. I know that clangd's index symbol collector resolves symlink explicitly, but I don't think clangd enforces symlink resolution in general?
> > > I guess it depends on whether you want symlink resolution or not. I know that clangd's index symbol collector resolves symlink explicitly, but I don't think clangd enforces symlink resolution in general?
> > 
> > If we don't, we probably risk having duplicate results similar to what
> > 
> >   https://reviews.llvm.org/D48687
> > 
> > fixed, by with paths differing because of symlinks instead of dot-dots.
> Was the bug addressed in D48687 actually caused by symlinks though? If want we want is absolute paths with dotdot cleaned, it should be much cheaper to call `VFS::makeAbsolutePath` with `remove_dot_dot`.
> 
> In general, it's unclear whether clangd should resolve symlink (it might not always be what users want), and it should probably be a decision made by the build system integration. I think we would need a much more careful design if we decide to handle symlinks in clangd. 
> Was the bug addressed in D48687 actually caused by symlinks though?

No, it was caused by non-relative path that included dotdots, I was extrapolating.

> If want we want is absolute paths with dotdot cleaned, it should be much cheaper to call VFS::makeAbsolutePath with remove_dot_dot.

Agreed.

> In general, it's unclear whether clangd should resolve symlink (it might not always be what users want), and it should probably be a decision made by the build system integration. I think we would need a much more careful design if we decide to handle symlinks in clangd.

Indeed, if those paths end up displayed in the UI, it could be preferrable to not resolve the symlinks.  Using the real_path was kind of the "atomic bomb" solution to get rid of duplicates.


Repository:
  rC Clang

https://reviews.llvm.org/D51159





More information about the cfe-commits mailing list