[PATCH] Fixing a driver bug where clang cannot locate source file at drive root

Yunzhong Gao Yunzhong_Gao at playstation.sony.com
Tue Jul 23 14:56:52 PDT 2013

  Hi Aaron and Rafael,

  Many thanks for reviewing my patch. I did a little more investigation on how
  the file I/O works on LLVM/Clang.

  When I tested on Windows 7, ::open("C:test.c", O_RDONLY) and
  ::stat("C:test.c", &stat_buffer) return success. On the other hand,
  ::stat("C:") returns failure. With a trailing back slash, ::stat("C:\\")
  returns success.

  Rafael asked whether this is a clang-specifc problem or a llvm problem.
  It looks like clang uses FileManager::getFile() to open a file, which will
  first check for existence of the enclosing directory before trying to open
  the file. This first check is where the trailing back slash makes the
  difference. It is done by calling FileSystemStatCache::get() on the directory
  name, which then calls library function ::stat(). Eventually it uses
  MemoryBuffer::getFile() to open the file.

  I checked the other llvm tools and they do not checking for the enclosing
  directory as clang does:
    llvm-as uses llvm::ParseAssemblyFile() and then MemoryBuffer::getFile()
      to open a file. It does not check for existence of the enclosing
      directory, so the difference between stat("C:\\") and stat("C:") does
      not affect llvm-as here.
    bugpoint, llc, lli, llvm-diff, llvm-link, llvm-jitlistener and opt use
      llvm::ParseIRFile() and then MemoryBuffer::getFile() to open the input
      file. Again they do not check for existence of the enclosing directory.
    llvm-bcanalyzer, llvm-cov, llvm-mc, llvm-mcmarkup, llvm-nm, llvm-dwarfdump,
      llvm-prof, llvm-rtdyld, obj2yaml, yaml2obj, utils/FileCheck,
      utils/FileUpdate, llvm-tblgen, clang-tblgen also use
      MemoryBuffer->getFileOrSTDIN() and then MemoryBuffer->getFile() to open
      the input file without checking for existence of the enclosing directory.
    llvm-objdump, llvm-readobj, llvm-size and macho-dump use
      llvm::object::createBinary() and then MemoryBuffer::getFile() to open the
      input file without checking for existence of the enclosing file.
    llvm-ar uses Memory::getFile() to open the archive file. It does
      not check for existence of the enclosing file.
    llvm-config needs to compare current directory to development tree for
      equality, and when doing so, always gets absolute paths first.
    llvm-dis uses DataStreamer::OpenFile() and then sys::fs::openFileForRead(),
      which calls open() on Linux and CreateFileW() on Windows.
    llvm-extract uses llvm::getLazyIRFileModule() and then
      MemoryBuffer::getFile() to open the input file.
    llvm-stress takes no input file; it uses raw_fd_ostream::raw_fd_ostream()
      to open a file for write, which calls sys::fs::openFileForWrite().
    llvm-symbolizer uses LLVMSymbolizer::getOrCreateModuleInfo(), then
      getOrCreateBinary() => object::createBinary() => MemoryBuffer::getFile()
      to open the input file.
    utils/count reads from stdin.
  Because none of these llvm tools check for existence of the enclosing
  directory before opening their input files, they do not have the same problem
  as clang. So to answer Rafael's question, this is indeed specific to clang.

  My proposed patch is in FileManager::getDirectory() because I see some codes
  in the beginning of this function already trying to patch the directory name.
  Alternatively we could also try to fix up the directory name in llvm::sys::
  path::parent_path() or FileManager::getDirectoryFromFile(). Doing it in
  getDirectoryFromFile() could take the same approach as I did in getDirectory().
  Doing it in parent_path() might be a bit more tricky because to grow the
  string, some memory allocator is needed: I would like some advice on what is
  the best way to handle it.

  To answer Aaron's question, I used path::root_name() instead of
  path::root_path() so that the expression "DirName == root_name(DirName)"
  returns different results for "C:" and "C:\", where the fix is really only
  needed for the "C:" case.

  "C:test.c" is not directly passed to path::root_name(). Before entering
  FileManager::getDirectory(), where my patch is intended, path::parent_path()
  will be run on the file path to extract the directory name first before
  passing the directory name to path::root_name(), which returns "C:" for
  both C:\ and C:. Does this address your question, Aaron?

  I wrote a unit test thanks to Aaron's suggestion. I am adding it to
  clang/unittests/Basic/FileManagerTest.cpp. Could you take a look?

  Many thanks,
  - Gao.




Index: unittests/Basic/FileManagerTest.cpp
--- unittests/Basic/FileManagerTest.cpp
+++ unittests/Basic/FileManagerTest.cpp
@@ -125,6 +125,14 @@
   FakeStatCache *statCache = new FakeStatCache;
   statCache->InjectDirectory("/tmp", 42);
   statCache->InjectFile("/tmp/test", 43);
+#ifdef _WIN32
+  const char *DirName = "C:.";
+  const char *FileName = "C:test";
+  statCache->InjectDirectory(DirName, 44);
+  statCache->InjectFile(FileName, 45);
   const FileEntry *file = manager.getFile("/tmp/test");
@@ -134,6 +142,15 @@
   const DirectoryEntry *dir = file->getDir();
   ASSERT_TRUE(dir != NULL);
   EXPECT_STREQ("/tmp", dir->getName());
+#ifdef _WIN32
+  file = manager.getFile(FileName);
+  ASSERT_TRUE(file != NULL);
+  dir = file->getDir();
+  ASSERT_TRUE(dir != NULL);
+  EXPECT_STREQ(DirName, dir->getName());
 // getFile() returns non-NULL if a virtual file exists at the given path.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D937.3.patch
Type: text/x-patch
Size: 1029 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130723/3cfc3a92/attachment.bin>

More information about the cfe-commits mailing list