[PATCH] D60980: [libFuzzer] Replace -seed_corpus to better support fork mode on Win

Thu Apr 25 17:59:22 PDT 2019

kcc added inline comments.

================
Comment at: compiler-rt/lib/fuzzer/FuzzerDriver.cpp:771
+    std::istringstream SeedInputsStream(FileToString(SeedInputsFile));
+    RemoveFile(SeedInputsFile);
+    std::string seed_input;
----------------
metzman wrote:
> Removing the file is somewhat hostile to users but is the best way to prevent the accumulation of files in fork mode.
> Since the primary purpose of -seed_inputs_file is for fork mode, I think this is acceptable. Thoughts?
A bit too hostile indeed, and in this case the file is deleted by a process that didn't create it, making it more confusing. 
I'd prefer to delete the file in the same place we delete other temporary files for a child.

================
Comment at: compiler-rt/lib/fuzzer/FuzzerFlags.def:23
   "size up to max_len.")
-FUZZER_FLAG_STRING(seed_inputs, "A comma-separated list of input files "
-  "to use as an additional seed corpus")
+FUZZER_FLAG_STRING(seed_inputs_file, "A file containing a comma-separated list "
+  "of input files to use as an additional seed corpus.")
----------------
I found this flag to be useful by itself, outside the fork mode, so instead of replacing it with a new flag, 
I'd suggest enhancing it's behavior: 
  * --seed_inputs=aaa,bbb keeps working as is
  * --seed_inputs=@file_path reads the list from the file

================
Comment at: compiler-rt/lib/fuzzer/FuzzerFork.cpp:126
+      std::string SeedsFile = "seeds-list." + std::to_string(JobId);
+      WriteToFile(reinterpret_cast<const uint8_t*>(Seeds.c_str()), Seeds.size(), SeedsFile);
+      Cmd.addFlag("seed_inputs_file", SeedsFile);
----------------
for readability, I'd prefer to introduce another variant of WriteToFile:
  void WriteToFile(const std::string &Str, const std::string &Path);

================
Comment at: compiler-rt/test/fuzzer/cross_over.test:18
 # Test the same thing but using -seed_inputs instead of passing the corpus dir.
-RUN: not %run %t-CrossOverTest -max_len=10 -seed=1 -runs=10000000 -seed_inputs=%t-corpus/A,%t-corpus/B
+RUN: python -c "import sys; sys.stdout.write(r'%t-corpus/A,%t-corpus/B')" > %t.seed-inputs
+RUN: not %run %t-CrossOverTest -max_len=10 -seed=1 -runs=10000000 -seed_inputs_file=%t.seed-inputs
----------------
metzman wrote:
> The reason why I do this hacky python thing is because echo leaves a trailing newline and printf didn't work well with the percent formatting.
> Is it OK to use python for this purpose (I think python is necessary to run lit commands anyway)?
> If not, I can use `echo` so long as I end with a trailing `,`, that way the last seed input is "\n" instead of a newline being added to a filename we actually care about. Does this sound better than the python hack?
no need to change this test with the change I proposed. 

================
Comment at: compiler-rt/test/fuzzer/seed_inputs_file.test:4
+USE-2: INFO: seed corpus: files: 2
+RUN: python -c "import sys; sys.stdout.write(r'%t-SimpleTest,%t-SimpleTest')" > %t.seed-inputs
+RUN: %run %t-SimpleTest -runs=1 -seed_inputs_file=%t.seed-inputs 2>&1 | FileCheck %s --check-prefix=USE-2
----------------
will echo -n work in this case? 
python is indeed a bit unwelcome where. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60980/new/

https://reviews.llvm.org/D60980