[PATCH] D51359: Adding HardLink Support to VirtualFileSystem.
    Ilya Biryukov via Phabricator via cfe-commits 
    cfe-commits at lists.llvm.org
       
    Mon Sep  3 02:19:46 PDT 2018
    
    
  
ilya-biryukov added a comment.
Thanks, the interface and implementation look good!
Last drop of nits and suggestions for code simplification.
================
Comment at: include/clang/Basic/VirtualFileSystem.h:359
+  public:
+  /// Add a HardLink to a File.
+  /// The To path must be an existing file or a hardlink. The From file must not
----------------
usaxena95 wrote:
> ilya-biryukov wrote:
> > Maybe document that it does nothing for directories?
> > Also, any reason to not support directories at this point? The limitation seem artificial at this point.
> Links to directories cannot be  added with the current logic. Resolving paths and iterating directories becomes non trivial. Current implementation supports the use case of "file with multiple names" which is mostly sufficient to mimic the compilation.
You're right. E.g. one potential trouble is cycles in the directory tree.
Let's leave them out, directory links do seem like some hard work and we don't have a use-case for them.
================
Comment at: include/clang/Basic/VirtualFileSystem.h:364
+  /// The To path must be an existing file or a hardlink. The From file must not
+  /// have been added before. The From Node is added as an InMemoryHardLink
+  /// which points to the resolved file of To Node. The From path or the To Path
----------------
InMemoryHardLink is a private implementation detail. Maybe remove it from the docs of the public API? 
================
Comment at: include/clang/Basic/VirtualFileSystem.h:369
+  /// successfully created, false otherwise.
+  bool addHardLink(const Twine &From, const Twine &To,
+                   Optional<uint32_t> User = None,
----------------
Maybe expand a little on what a "hard link" is in our implementation?
Specifically, the important bits seem to be:
They are not intended to be fully equivalent to hard links of the classical filesystems, despite the name.
Both the original file and a hard link have the same UniqueID returned in their stats, share the same buffer, etc.
There is no way to distinguish hard links and the original files after they were added.
(this is already in the docs) Only links to files are supported, directories cannot be linked.
================
Comment at: include/clang/Basic/VirtualFileSystem.h:370
+  bool addHardLink(const Twine &From, const Twine &To,
+                   Optional<uint32_t> User = None,
+                   Optional<uint32_t> Group = None,
----------------
Maybe remove these defaulted parameters? They're not used and we'll probably have confusing semantics even if we add support for them (since our hard links don't have their own status, and merely return the status of the original file).
WDYT?
================
Comment at: lib/Basic/VirtualFileSystem.cpp:476
   InMemoryNodeKind Kind;
-
-protected:
-  /// Return Stat.  This should only be used for internal/debugging use.  When
-  /// clients wants the Status of this node, they should use
-  /// \p getStatus(StringRef).
-  const Status &getStatus() const { return Stat; }
+  string FileName;
 
----------------
Please use std::string, we usually don't leave out the std:: namespace in LLVM.
================
Comment at: lib/Basic/VirtualFileSystem.cpp:523
+      : InMemoryNode(Path.str(), IME_HardLink), ResolvedFile(ResolvedFile) {}
+  const InMemoryFile *getResolvedFile() const { return &ResolvedFile; }
+
----------------
Maybe return a reference here too? To indicate this can't be null.
================
Comment at: lib/Basic/VirtualFileSystem.cpp:653
+  // Cannot create HardLink from a directory.
+  if (HardLinkTarget && (!isa<detail::InMemoryFile>(HardLinkTarget) || Buffer))
+    return false;
----------------
HardLinkTarget can only be `InMemoryFile`, so maybe change the type of the parameter to `const InMemoryFile*`?
Clients will be responsible for making sure the target is actually a file, but that seems fine, they already have to resolve the path to the target file.
================
Comment at: lib/Basic/VirtualFileSystem.cpp:665
         // End of the path, create a new file or directory.
         Status Stat(P.str(), getNextVirtualUniqueID(),
                     llvm::sys::toTimePoint(ModificationTime), ResolvedUser,
----------------
We don't use `Status` for links, maybe move creation of `Status` into the branch that handles files?
This would allow to assert `Buffer` is not null too
================
Comment at: lib/Basic/VirtualFileSystem.cpp:690
           getNextVirtualUniqueID(), llvm::sys::toTimePoint(ModificationTime),
-          ResolvedUser, ResolvedGroup, Buffer->getBufferSize(),
-          sys::fs::file_type::directory_file, NewDirectoryPerms);
+          ResolvedUser, ResolvedGroup, 0, sys::fs::file_type::directory_file,
+          NewDirectoryPerms);
----------------
Oh, wow, this used to set the size of the created parent directories to the size of the buffer we're creating. I guess nobody looks at it anyway, so we were never hitting any problems because of this.
Good catch, thanks for fixing this.
================
Comment at: lib/Basic/VirtualFileSystem.cpp:803
   auto Node = lookupInMemoryNode(*this, Root.get(), Path);
-  if (Node)
-    return (*Node)->getStatus(Path.str());
+  if (Node) {
+     assert(!isa<detail::InMemoryHardLink>(*Node));
----------------
NIT: maybe invert condition?
I.e. change to 
```
if (!Node)
  return Node.getError();
//.. rest of the code
```
to keep the amount of code with higher nesting small. See somewhat related section in [[ https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code | LLVM Style Guide ]]
================
Comment at: lib/Basic/VirtualFileSystem.cpp:804
+  if (Node) {
+     assert(!isa<detail::InMemoryHardLink>(*Node));
+    if (auto Dir = dyn_cast<detail::InMemoryDirectory>(*Node))
----------------
This line seems to have wrong indent.
Running `git-clang-format` before submitting the code is a convenient way to fix this.
================
Comment at: lib/Basic/VirtualFileSystem.cpp:809
+      return File->getStatus(Path.str());
+  }
   return Node.getError();
----------------
NIT: add `llvm_unreachable()` to make it clear we don't have a 4th kind of node?
================
Comment at: lib/Basic/VirtualFileSystem.cpp:841
       llvm::sys::path::append(Path, I->second->getFileName());
-      CurrentEntry = I->second->getStatus(Path);
+      if (auto Dir = dyn_cast<detail::InMemoryDirectory>(I->second.get()))
+        CurrentEntry = Dir->getStatus(Path);
----------------
Maybe extract it into a small helper function?
It only pops up twice, but a helper should still give better readability for both cases.
================
Comment at: unittests/Basic/VirtualFileSystemTest.cpp:702
 
+MATCHER_P3(IsHardLinktTo, FS, Target, ExpectedBuffer, "") {
+  Twine From = arg;
----------------
s/IsHardLinktTo/IsHardLinkTo
================
Comment at: unittests/Basic/VirtualFileSystemTest.cpp:705
+  Twine To = Target;
+  auto OpenedFrom = FS->openFileForRead(From);
+  if (OpenedFrom.getError())
----------------
This matcher is pretty complicated and tests multiple things.
If it fails, it might be hard to track down what exactly went wrong.
Given the name, it seems reasonable for it to simply test that both paths point to the same unique-id.
We don't have to test every other invariant (sizes are the same, buffers are the same) on all the links we create, one test assertion per invariant should be enough (sizes are the same, buffers are the same, etc)
Repository:
  rC Clang
https://reviews.llvm.org/D51359
    
    
More information about the cfe-commits
mailing list