[PATCH] D76869: [Clang] Restore replace_path_prefix instead of startswith

Adrian McCarthy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 5 15:40:41 PDT 2020


amccarth added a comment.

I think this is pretty close now.



================
Comment at: llvm/lib/Support/Path.cpp:510
+        return false;
+      if (!SepPath && toLower(Path[I]) != toLower(Prefix[I]))
+        return false;
----------------
amccarth wrote:
> MaskRay wrote:
> > MaskRay wrote:
> > > I am sure toLower works when Unicode is considered.
> > > 
> > > @amccarth has made many useful comments in this area before, so I added him as a blocking reviewer.
> > s/am sure/am not sure/
> If I recall correctly, the case insensitivity in Windows paths is not locale-aware.  In fact, NTFS has a hard-coded case table.  So that's not an issue.  Also not an issue is Unicode normalization.  I'm pretty sure you can name one file "Ä" and another "Ä" and they will remain distinct because one is a pre-composited code point and the other is a regular "A" with a combining mark.  So that's another thing you don't have to worry about.
> 
> But it looks like `toLower` only handles basic ASCII, so, yeah, Unicode path names won't benefit from this attempt at case-blind comparison.  I'm not sure what facilities LLVM has for case folding for the rest of the BMP.  Maybe this is good enough for now?  I don't have any data on how common it is for people to expect proper case-blind comparisons of non-ASCII characters in Windows paths.
I don't have an opinion on how to handle the Unicode/case issues.  I don't know how important "perfect" case handling is or whether it should block "good enough" given that we don't have it know and this will help many.  So I'm not going to block on that unless somebody with more info or stronger opinions chimes in.

[I just read that Windows 10 is now letting people experiment with case-sensitive portions of a file system.  So who knows where this is going in the future!]


================
Comment at: llvm/lib/Support/Path.cpp:519
 bool replace_path_prefix(SmallVectorImpl<char> &Path, StringRef OldPrefix,
-                         StringRef NewPrefix) {
+                         StringRef NewPrefix, Style PrefixStyle) {
   if (OldPrefix.empty() && NewPrefix.empty())
----------------
I like that this has the style parameter now.  Thanks!

But naming it `PrefixStyle` seems odd.  It suggests that it applies to all of the `OldPrefix` and `NewPrefix` but not the `Path`.  Can you call it simply `style` to be consistent with most everything else in this path support file?


================
Comment at: llvm/unittests/Support/Path.cpp:1342
+  Found = path::replace_path_prefix(Path, OldPrefixWin, NewPrefix,
+                                    path::Style::windows);
+  EXPECT_TRUE(Found);
----------------
No `#ifdef` in the test now.  Yay!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76869/new/

https://reviews.llvm.org/D76869





More information about the llvm-commits mailing list