<div dir="ltr">Hi Bruno,<div><br></div><div>I had to revert this in r269100 because it looked like the bot would otherwise be left red overnight.</div><div><br></div><div>Changes to this VFS code tend to break on Windows. Any idea why that is? I can understand things breaking on Windows when writing low-level parts of an FS abstraction, but this patch seems fairly high-level. Is there a missing layering or something?</div><div><br></div><div>-- Sean Silva</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 10, 2016 at 11:43 AM, Bruno Cardoso Lopes via cfe-commits <span dir="ltr"><<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: bruno<br>
Date: Tue May 10 13:43:00 2016<br>
New Revision: 269100<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=269100&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=269100&view=rev</a><br>
Log:<br>
[VFS] Reconstruct the VFS overlay tree for more accurate lookup<br>
<br>
The way we currently build the internal VFS overlay representation leads<br>
to inefficient path search and might yield wrong answers when asked for<br>
recursive or regular directory iteration.<br>
<br>
Currently, when reading a YAML file, each YAML root entry is placed<br>
inside a new root in the filesystem overlay. In the crash reproducer, a<br>
simple "@import Foundation" currently maps to 43 roots, and when looking<br>
up paths, we traverse a directory tree for each of these different<br>
roots, until we find a match (or don't). This has two consequences:<br>
<br>
- It's slow.<br>
- Directory iteration gives incomplete results, since it only returns<br>
results from within one root. Because contents of the same directory can be<br>
declared inside different roots, the result isn't accurate.<br>
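<br>
To make the second point concrete, here is a minimal sketch of the failure mode. It is not the Clang code: Root, listFirstMatch and makeSplitRoots are hypothetical names, and a root is modeled as nothing more than a directory path plus the names declared in it.<br>
<br>
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: each YAML root entry becomes its own overlay root.
struct Root {
  std::string Dir;                   // directory this root describes
  std::vector<std::string> Children; // names declared inside it
};

// Mimics an iteration that stops at the first root matching Dir, so
// entries declared for the same directory in later roots are never seen.
std::vector<std::string> listFirstMatch(const std::vector<Root> &Roots,
                                        const std::string &Dir) {
  assert(!Dir.empty() && "expected a directory path");
  for (const Root &R : Roots)
    if (R.Dir == Dir)
      return R.Children;
  return {};
}

// Two roots that both describe //root/baz, mirroring the shape of the
// YAML added to the unit test in this patch.
std::vector<Root> makeSplitRoots() {
  return {{"//root/baz", {"x"}}, {"//root/baz", {"y"}}};
}
```
<br>
With these two roots, listing //root/baz reports only the entry from the first root; the one declared in the second root is silently dropped.<br>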
<br>
This is partly the fault of the way we currently write out the YAML file<br>
when emitting the crash reproducer - we could generate only one root and<br>
that would make it fast and correct again. However, we should not rely<br>
on how the client writes the YAML, but provide a good internal<br>
representation regardless.<br>
<br>
This patch builds a proper virtual directory tree out of the YAML<br>
representation, allowing faster search and proper iteration. Besides the<br>
crash reproducer, this potentially benefits other VFS clients.<br>
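<br>
The idea of folding repeated roots into one directory tree can be sketched as follows. This is an illustration under simplifying assumptions, not the patch itself: Node, lookupOrCreateDir, addFile and listDir are hypothetical names, paths are passed as pre-split components, and children live in a std::map, whereas the patch keeps a vector of entries and searches it linearly.<br>
<br>
```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: a directory owns its children keyed by name;
// a file records the external path it redirects to.
struct Node {
  bool IsDir = true;
  std::string ExternalPath; // only meaningful for files
  std::map<std::string, std::unique_ptr<Node>> Children;
};

// Returns the child called Name, creating it as a directory if missing.
// The sketch assumes a file and a directory never share a name.
Node *lookupOrCreateDir(Node &Parent, const std::string &Name) {
  assert(Parent.IsDir && "parent must be a directory");
  std::unique_ptr<Node> &Slot = Parent.Children[Name];
  if (!Slot)
    Slot = std::make_unique<Node>();
  return Slot.get();
}

// Adds a file mapping under the directory given by DirComponents.
// Directories shared by several calls are reused, not duplicated.
void addFile(Node &Top, const std::vector<std::string> &DirComponents,
             const std::string &FileName, const std::string &ExternalPath) {
  Node *Cur = &Top;
  for (const std::string &C : DirComponents)
    Cur = lookupOrCreateDir(*Cur, C);
  auto File = std::make_unique<Node>();
  File->IsDir = false;
  File->ExternalPath = ExternalPath;
  Cur->Children[FileName] = std::move(File);
}

// Lists the names directly inside the directory at DirComponents,
// or returns an empty list if that directory does not exist.
std::vector<std::string> listDir(const Node &Top,
                                 const std::vector<std::string> &DirComponents) {
  const Node *Cur = &Top;
  for (const std::string &C : DirComponents) {
    auto It = Cur->Children.find(C);
    if (It == Cur->Children.end() || !It->second->IsDir)
      return {};
    Cur = It->second.get();
  }
  std::vector<std::string> Names;
  for (const auto &KV : Cur->Children)
    Names.push_back(KV.first);
  return Names;
}
```
<br>
Calling addFile twice with the directory components {"root", "baz"}, once for x and once for y, leaves a single baz directory holding both files, so one walk from the top finds everything declared for that directory.<br>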
<br>
Modified:<br>
    cfe/trunk/lib/Basic/VirtualFileSystem.cpp<br>
    cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp<br>
<br>
Modified: cfe/trunk/lib/Basic/VirtualFileSystem.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/VirtualFileSystem.cpp?rev=269100&r1=269099&r2=269100&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/VirtualFileSystem.cpp?rev=269100&r1=269099&r2=269100&view=diff</a><br>
==============================================================================<br>
--- cfe/trunk/lib/Basic/VirtualFileSystem.cpp (original)<br>
+++ cfe/trunk/lib/Basic/VirtualFileSystem.cpp Tue May 10 13:43:00 2016<br>
@@ -719,7 +719,13 @@ public:<br>
                             Status S)<br>
       : Entry(EK_Directory, Name), Contents(std::move(Contents)),<br>
         S(std::move(S)) {}<br>
+  RedirectingDirectoryEntry(StringRef Name, Status S)<br>
+      : Entry(EK_Directory, Name), S(std::move(S)) {}<br>
   Status getStatus() { return S; }<br>
+  void addContent(std::unique_ptr<Entry> Content) {<br>
+    Contents.push_back(std::move(Content));<br>
+  }<br>
+  Entry *getLastContent() const { return Contents.back().get(); }<br>
   typedef decltype(Contents)::iterator iterator;<br>
   iterator contents_begin() { return Contents.begin(); }<br>
   iterator contents_end() { return Contents.end(); }<br>
@@ -747,6 +753,7 @@ public:<br>
     return UseName == NK_NotSet ? GlobalUseExternalName<br>
                                 : (UseName == NK_External);<br>
   }<br>
+  NameKind getUseName() const { return UseName; }<br>
   static bool classof(const Entry *E) { return E->getKind() == EK_File; }<br>
 };<br>
<br>
@@ -1023,6 +1030,70 @@ class RedirectingFileSystemParser {<br>
     return true;<br>
   }<br>
<br>
+  Entry *lookupOrCreateEntry(RedirectingFileSystem *FS, StringRef Name,<br>
+                             Entry *ParentEntry = nullptr) {<br>
+    if (!ParentEntry) { // Look for an existing root<br>
+      for (const std::unique_ptr<Entry> &Root : FS->Roots) {<br>
+        if (Name.equals(Root->getName())) {<br>
+          ParentEntry = Root.get();<br>
+          return ParentEntry;<br>
+        }<br>
+      }<br>
+    } else { // Advance to the next component<br>
+      auto *DE = dyn_cast<RedirectingDirectoryEntry>(ParentEntry);<br>
+      for (std::unique_ptr<Entry> &Content :<br>
+           llvm::make_range(DE->contents_begin(), DE->contents_end())) {<br>
+        auto *DirContent = dyn_cast<RedirectingDirectoryEntry>(Content.get());<br>
+        if (DirContent && Name.equals(Content->getName()))<br>
+          return DirContent;<br>
+      }<br>
+    }<br>
+<br>
+    // ... or create a new one<br>
+    std::unique_ptr<Entry> E = llvm::make_unique<RedirectingDirectoryEntry>(<br>
+        Name, Status("", getNextVirtualUniqueID(), sys::TimeValue::now(), 0, 0,<br>
+                     0, file_type::directory_file, sys::fs::all_all));<br>
+<br>
+    if (!ParentEntry) { // Add a new root to the overlay<br>
+      FS->Roots.push_back(std::move(E));<br>
+      ParentEntry = FS->Roots.back().get();<br>
+      return ParentEntry;<br>
+    }<br>
+<br>
+    auto *DE = dyn_cast<RedirectingDirectoryEntry>(ParentEntry);<br>
+    DE->addContent(std::move(E));<br>
+    return DE->getLastContent();<br>
+  }<br>
+<br>
+  void uniqueOverlayTree(RedirectingFileSystem *FS, Entry *SrcE,<br>
+                         Entry *NewParentE = nullptr) {<br>
+    StringRef Name = SrcE->getName();<br>
+    switch (SrcE->getKind()) {<br>
+    case EK_Directory: {<br>
+      auto *DE = dyn_cast<RedirectingDirectoryEntry>(SrcE);<br>
+      assert(DE && "Must be a directory");<br>
+      // Empty directories could be present in the YAML as a way to<br>
+      // describe a file for a current directory after some of its subdir<br>
+      // is parsed. This only leads to redundant walks, ignore it.<br>
+      if (!Name.empty())<br>
+        NewParentE = lookupOrCreateEntry(FS, Name, NewParentE);<br>
+      for (std::unique_ptr<Entry> &SubEntry :<br>
+           llvm::make_range(DE->contents_begin(), DE->contents_end()))<br>
+        uniqueOverlayTree(FS, SubEntry.get(), NewParentE);<br>
+      break;<br>
+    }<br>
+    case EK_File: {<br>
+      auto *FE = dyn_cast<RedirectingFileEntry>(SrcE);<br>
+      assert(FE && "Must be a file");<br>
+      assert(NewParentE && "Parent entry must exist");<br>
+      auto *DE = dyn_cast<RedirectingDirectoryEntry>(NewParentE);<br>
+      DE->addContent(llvm::make_unique<RedirectingFileEntry>(<br>
+          Name, FE->getExternalContentsPath(), FE->getUseName()));<br>
+      break;<br>
+    }<br>
+    }<br>
+  }<br>
+<br>
   std::unique_ptr<Entry> parseEntry(yaml::Node *N, RedirectingFileSystem *FS) {<br>
     yaml::MappingNode *M = dyn_cast<yaml::MappingNode>(N);<br>
     if (!M) {<br>
@@ -1225,6 +1296,7 @@ public:<br>
     };<br>
<br>
     DenseMap<StringRef, KeyStatus> Keys(std::begin(Fields), std::end(Fields));<br>
+    std::vector<std::unique_ptr<Entry>> RootEntries;<br>
<br>
     // Parse configuration and 'roots'<br>
     for (yaml::MappingNode::iterator I = Top->begin(), E = Top->end(); I != E;<br>
@@ -1247,7 +1319,7 @@ public:<br>
         for (yaml::SequenceNode::iterator I = Roots->begin(), E = Roots->end();<br>
              I != E; ++I) {<br>
           if (std::unique_ptr<Entry> E = parseEntry(&*I, FS))<br>
-            FS->Roots.push_back(std::move(E));<br>
+            RootEntries.push_back(std::move(E));<br>
           else<br>
             return false;<br>
         }<br>
@@ -1288,6 +1360,13 @@ public:<br>
<br>
     if (!checkMissingKeys(Top, Keys))<br>
       return false;<br>
+<br>
+    // Now that we successfully parsed the YAML file, canonicalize the internal<br>
+    // representation to a proper directory tree so that we can search faster<br>
+    // inside the VFS.<br>
+    for (std::unique_ptr<Entry> &E : RootEntries)<br>
+      uniqueOverlayTree(FS, E.get());<br>
+<br>
     return true;<br>
   }<br>
 };<br>
<br>
Modified: cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp?rev=269100&r1=269099&r2=269100&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp?rev=269100&r1=269099&r2=269100&view=diff</a><br>
==============================================================================<br>
--- cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp (original)<br>
+++ cfe/trunk/unittests/Basic/VirtualFileSystemTest.cpp Tue May 10 13:43:00 2016<br>
@@ -1022,9 +1022,14 @@ TEST_F(VFSFromYAMLTest, DirectoryIterati<br>
   Lower->addDirectory("//root/");<br>
   Lower->addDirectory("//root/foo");<br>
   Lower->addDirectory("//root/foo/bar");<br>
+  Lower->addDirectory("//root/zab");<br>
+  Lower->addDirectory("//root/baz");<br>
   Lower->addRegularFile("//root/foo/bar/a");<br>
   Lower->addRegularFile("//root/foo/bar/b");<br>
   Lower->addRegularFile("//root/file3");<br>
+  Lower->addRegularFile("//root/zab/a");<br>
+  Lower->addRegularFile("//root/zab/b");<br>
+  Lower->addRegularFile("//root/baz/c");<br>
   IntrusiveRefCntPtr<vfs::FileSystem> FS =<br>
   getFromYAMLString("{ 'use-external-names': false,\n"<br>
                     "  'roots': [\n"<br>
@@ -1042,6 +1047,26 @@ TEST_F(VFSFromYAMLTest, DirectoryIterati<br>
                     "                  'external-contents': '//root/foo/bar/b'\n"<br>
                     "                }\n"<br>
                     "              ]\n"<br>
+                    "},\n"<br>
+                    "{\n"<br>
+                    "  'type': 'directory',\n"<br>
+                    "  'name': '//root/baz',\n"<br>
+                    "  'contents': [ {\n"<br>
+                    "                  'type': 'file',\n"<br>
+                    "                  'name': 'x',\n"<br>
+                    "                  'external-contents': '//root/zab/a'\n"<br>
+                    "                }\n"<br>
+                    "              ]\n"<br>
+                    "},\n"<br>
+                    "{\n"<br>
+                    "  'type': 'directory',\n"<br>
+                    "  'name': '//root/baz',\n"<br>
+                    "  'contents': [ {\n"<br>
+                    "                  'type': 'file',\n"<br>
+                    "                  'name': 'y',\n"<br>
+                    "                  'external-contents': '//root/zab/b'\n"<br>
+                    "                }\n"<br>
+                    "              ]\n"<br>
                     "}\n"<br>
                     "]\n"<br>
                     "}",<br>
@@ -1054,8 +1079,12 @@ TEST_F(VFSFromYAMLTest, DirectoryIterati<br>
<br>
   std::error_code EC;<br>
   checkContents(O->dir_begin("//root/", EC),<br>
-                {"//root/file1", "//root/file2", "//root/file3", "//root/foo"});<br>
+                {"//root/file1", "//root/file2", "//root/baz", "//root/file3",<br>
+                 "//root/foo", "//root/zab"});<br>
<br>
   checkContents(O->dir_begin("//root/foo/bar", EC),<br>
                 {"//root/foo/bar/a", "//root/foo/bar/b"});<br>
+<br>
+  checkContents(O->dir_begin("//root/baz", EC),<br>
+                {"//root/baz/x", "//root/baz/y", "//root/baz/c"});<br>
 }<br>
</blockquote></div><br></div>