[PATCH] D121733: Clean pathnames in FileManager.

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 18 18:16:39 PDT 2022


dexonsmith added a comment.

In D121733#3393640 <https://reviews.llvm.org/D121733#3393640>, @dexonsmith wrote:

> In D121733#3393552 <https://reviews.llvm.org/D121733#3393552>, @bnbarham wrote:
>
>> In D121733#3393546 <https://reviews.llvm.org/D121733#3393546>, @rnk wrote:
>>
>>> I've been somewhat afraid to touch this code because of symlinks. Is that something we need to worry about? Consider this path: root/symlink/../foo.h. remove_dots will turn this into root/foo.h, but if symlink points to some unrelated directory, the .. directory entry points to something other than root. I can't imagine that people rely on this behavior often, but I could be wrong.
>>
>> `remove_dots` doesn't remove `..` by default, it needs to be passed `true` to do that. This is only removing `.`.
>
> Yeah, I'd be against doing `..`-removal unless we had a cheap-enough way to detect if that was symlink-incorrect and skipped it in that case. But maybe it'd be worth a comment here explicitly saying `..`s were

A couple more thoughts:

- Probably we should change getDirectoryRef at the same time.
- Mabye this cleanup could mostly contained to there (rather than having it in both places).
  - Rely on `getDirectoryRef(parent_path(Filename))` to clean up the directory.
  - Construct "clean(er)" filename from `Dir->getName()` and `sys::path::filename(Filename))`.
  - Call `remove_leading_dotslash()` in case the dir is `"."`.
  - ... then go down and stat the file.
- I was a bit worried about the hacks for the VFS with "use-external-names=true" interfering with my suggestion above, but I looked and I think it's okay.
  - Hacks: see logic for when `Status.getName() != Filename`: `Filename` gets replaced with what `Status` returned.
  - Seems like `getDirectoryRef()` doesn't have this hack so I don't think it'll interfere here.
  - BTW, @bnbarham and I talked through a way to remove this hack (to model exposing the external name without replacing the access name); hopefully it won't stick around too much longer!
- It's probably not very hard/expensive to do `..`-removal correctly (correctly in all cases, by not removing `..`s unless `stat` proves it's safe).
  - Call `getDirectoryRef()` as above.
  - If there are any internal "..", try removing them and call `getDirectoryRef()` again. If they're successful and have the same inode, use the cleaner name.
  - The extra stat happens only once per directory that has a `..` component. Maybe not too bad?
  - ... again, probably better to do this internally to `getDirectoryRef()`
  - ... certainly seems better to land this as a second incremental step, since there's more potential for trouble.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121733/new/

https://reviews.llvm.org/D121733



More information about the cfe-commits mailing list