[PATCH] D58513: [libFuzzer][Windows] Port fork mode to Windows

Jonathan Metzman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 21 13:42:56 PST 2019


metzman added inline comments.


================
Comment at: compiler-rt/lib/fuzzer/FuzzerIOWindows.cpp:213
+      IterateDirRecursive(Path, DirPreCallback, DirPostCallback, FileCallback);
+    } else if (IsFile(PathAttrs) || IsLink(PathAttrs)) {
+      FileCallback(Path);
----------------
zturner wrote:
> zturner wrote:
> > zturner wrote:
> > > This doesn't look correct to me.  You can have a reparse point to something that itself isn't a file.  If it's a link, I think you need to check the reparse tag in `FindInfo.dwReserved0` as described [[ https://docs.microsoft.com/en-us/windows/desktop/api/minwinbase/ns-minwinbase-_win32_find_dataa | here ]]
> > > 
> > Another, perhaps less dirty way of handling the `IsLink()` case is to call `CreateFile()` on it with `FILE_FLAG_BACKUP_SEMANTICS`, which will cause it to follow symlinks, then call `GetFileInformationByHandle`.  That's actually probably the safest way, because I think you can have a reparse point which reparses to another reparse point, and rather than deal with all that stuff, it seems better to just let Windows follow the link for you, and then ask it what it opened.
> Clarity edit: `FILE_FLAG_BACKUP_SEMANTICS` will jsut cause it to not fail if it's a directory.  It follows symlinks by default (unless you specify `FILE_FLAG_OPEN_REPARSE_POINT`
Sorry I didn't see this round of comments before I posted mine.

Some context behind why I wrote things the way they are now: I'm trying to avoid calling the Windows File APIs more than necessary because (I think) they are too slow to be used like their posix counterparts.

I found (thanks to a bug report from a user) that when running libFuzzer on a large corpus, libFuzzer could spend ~50% of the time in `ReadDirToVectorOfUnits` and functions it calls, such as `ListFilesInDirRecursive`, `GetEpoch` and `CreateFile` (this seemed to crop up when using `-jobs=N`). 

Of course, if there's a correctness problem with the current approach, I can use the higher level API functions, but I wanted to see what you think of my most recent results before switching.

Maybe if we call `CreateFile` only on possible links, then perf won't be negatively impacted in 99% of cases since typically we will only be dealing with actual files.
For what it's worth, using `CreateFileA` with `FILE_FLAG_BACKUP_SEMANTICS` doesn't seem to work in all cases. 
If I use `IsFile` from line 26 instead of `IsLink` I get this error: `CreateFileA() failed for "<PATH>\test-dir\dir\file-symlink2" (Error code: 2)`
Whereas this file was handled correctly in the past.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58513/new/

https://reviews.llvm.org/D58513





More information about the llvm-commits mailing list