[PATCH] D19842: In openFileForRead, attempt to fetch the actual name of the file on disk -- including case -- so that clang can later warn about non-portable #include and #import directives.

Eric Niebler via llvm-commits llvm-commits at lists.llvm.org
Tue May 3 17:05:53 PDT 2016


eric_niebler added inline comments.

================
Comment at: lib/Support/Unix/Path.inc:522
@@ +521,3 @@
+  // Attempt to get the real name of the file, if the user asked
+  if(RealPath) {
+    RealPath->clear();
----------------
majnemer wrote:
> I'd go for an early return to reduce indentation.
OK.

================
Comment at: lib/Support/Unix/Path.inc:524-547
@@ +523,26 @@
+    RealPath->clear();
+  #if 0 //__APPLE__
+    // On MacOS, fgetattrlist is the most reliable way to get a properly-cased
+    // path.
+    // BUGBUG commented out since fgetattrlist/ATTR_CMN_NAME seems to only
+    // fetch the file *name* and not the path.
+    struct attrlist attrList;
+    memset(&attrList, 0, sizeof(attrList));
+    attrList.bitmapcount = ATTR_BIT_MAP_COUNT;
+    attrList.commonattr = ATTR_CMN_NAME;
+    struct NameAttr {
+      u_int32_t totalByteCount;
+      struct attrreference attrRef;
+      char Name[NAME_MAX * 3 + 1]; // Large enough for NAME_MAX UTF8 characters
+    };
+    union {
+      NameAttr nameAttr;
+      char Buffer[sizeof(nameAttr)];
+    };
+    if (fgetattrlist(ResultFD, &attrList, Buffer, sizeof(Buffer), 0) != -1) {
+      assert(nameAttr.attrRef.attr_dataoffset == sizeof(struct attrreference));
+      if (0 < nameAttr.attrRef.attr_length)
+        RealPath->append(nameAttr.Name,
+                         nameAttr.Name + nameAttr.attrRef.attr_length - 1);
+    }
+  #elif defined(F_GETPATH)
----------------
majnemer wrote:
> Please remember to remove code which is disabled if you check it in.
Roger. I included it only because an earlier reviewer asked that I try it, and also in the hopes that someone can tell me how to make `fgetattrlist` do something useful. This API is **horrible** and I'll delete it gladly.

================
Comment at: lib/Support/Unix/Path.inc:554-556
@@ +553,5 @@
+      RealPath->append(Buffer, Buffer + strlen(Buffer));
+  #else
+    // If we have a /proc filesystem mounted, we ca quickly establish the
+    // real name of the file with readlink
+    static const bool HasProc = (0 == ::access("/proc/self/fd", R_OK));
----------------
majnemer wrote:
> This should be `#ifdef __linux__`
`/proc` works elsewhere, like cygwin for instance.

================
Comment at: lib/Support/Unix/Path.inc:557
@@ +556,3 @@
+    // real name of the file with readlink
+    static const bool HasProc = (0 == ::access("/proc/self/fd", R_OK));
+    char Buffer[PATH_MAX];
----------------
majnemer wrote:
> I wouldn't bother with this, it has a non-trivial amount of overhead on non-X86 platforms.  I'd just fall-back to `::realpath` if `::readlink` failed.
> 
> In fact, why not fall back to `::realpath` regardless of platform?
Since `HasProc` is `static const`, this check is executed exactly once. The result can be used on all file open operations besides the first.

This code tries to get the real name, but it doesn't try //hard//. It quickly determines one method to try, and if that fails it gives up. The reason is because (a) the real path is used only to issue a warning, (b) getting the real path (including case) is unreliable on just about all platforms, and `realpath` is the most unreliable of all the APIs I've tested. As a result, false negatives for this diagnostic are business as usual, and (c) I don't want to slow down file open.

================
Comment at: lib/Support/Unix/Path.inc:560-561
@@ +559,4 @@
+    if (HasProc) {
+      char ProcPath[64];
+      int N = snprintf(ProcPath, sizeof(ProcPath), "/proc/self/fd/%d", ResultFD);
+      assert(N > 0 && N < sizeof(ProcPath));
----------------
majnemer wrote:
> I think it'd be nicer to build a twine like so:
>   SmallString<32> ProcPathStorage;
>   StringRef ProcPath = Twine("/proc/self/fd/").concat(ResultFD).toNullTerminatedStringRef(ProcPathStorage);
Serious question: why is that nicer?

================
Comment at: lib/Support/Unix/Path.inc:563-565
@@ +562,5 @@
+      assert(N > 0 && N < sizeof(ProcPath));
+      ssize_t CharCount = ::readlink(ProcPath, Buffer, sizeof(Buffer));
+      if (CharCount > 0)
+        RealPath->append(Buffer, Buffer + CharCount);
+    } else {
----------------
majnemer wrote:
> What's the contract on failure?  Should `RealPath` be cleared?
`RealPath` //is// cleared on line 523.

================
Comment at: lib/Support/Windows/Path.inc:701
@@ +700,3 @@
+    DWORD CountChars =
+      ::GetFinalPathNameByHandleW(H, RealPathUTF16, MAX_PATH, 0);
+    if (CountChars > 0 && CountChars < MAX_PATH) {
----------------
majnemer wrote:
> Could we use `FILE_NAME_NORMALIZED` instead of `0`?
OK.

================
Comment at: lib/Support/Windows/Path.inc:702
@@ +701,3 @@
+      ::GetFinalPathNameByHandleW(H, RealPathUTF16, MAX_PATH, 0);
+    if (CountChars > 0 && CountChars < MAX_PATH) {
+      // Convert the result from UTF-16 to UTF-8.
----------------
majnemer wrote:
> Seeing as how we support extended-length paths, it'd be nice to continue supporting them here.  Perhaps we could retry similarly to what we do in `Process::GetEnv` ?
I had retry logic but I removed it. Like I say above, I try to get the real filename, but I don't try //hard// in the interest of keeping file open snappy.

================
Comment at: lib/Support/Windows/Path.inc:704
@@ +703,3 @@
+      // Convert the result from UTF-16 to UTF-8.
+      SmallVector<char, MAX_PATH> RealPathUTF8;
+      if (!UTF16ToUTF8(RealPathUTF16, CountChars, RealPathUTF8))
----------------
majnemer wrote:
> I'd use `SmallString<MAX_PATH> RealPathUTF8;`
OK.


http://reviews.llvm.org/D19842





More information about the llvm-commits mailing list