[PATCH] D41776: [lit] Implement "-r" option for builtin "diff" command + a test using that.

Max Moroz via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jan 6 16:27:17 PST 2018


Dor1s marked an inline comment as done.
Dor1s added inline comments.


================
Comment at: utils/lit/lit/TestRunner.py:383-385
+            for file in fileList:
+                with open(os.path.join(dir, file), 'r') as f:
+                    dir_data[os.path.relpath(file, path)] = f.readlines()
----------------
zturner wrote:
> Would it be more efficient to just save off the paths instead of reading the lines up front?  You can imagine a case where a folder has a ton of files, but the other folder has none of those files.  Reading all those lines up front and making a strong reference to them is going to consume a huge amount of memory.  But you might not even need the contents of the file because the file might not exist in the other folder anyway, in which case you can early out on it.  It seems like all you need to do is build up a tuple like `(dirname, child_dirs, files)`, where `child_dirs` is itself a list of tuples of the exact same form.  But don't read the contents until you know you have to do the comparison, and throw the contents away before you do the next comparison, to guarantee your'e not loading an entire directory tree's worth of file contents into memory at once.
Agreed. Implemented a solution that does that and follows behavior of GNU diffutils better. I'll do a cleanup a bit later and ask for another round of review :)


================
Comment at: utils/lit/lit/TestRunner.py:431-434
+                for filename in filenames_in_dirs[0] - filenames_in_dirs[1]:
+                    stdout.write("Only in %s: %s" % (filepaths[0], filename))
+                for filename in filenames_in_dirs[1] - filenames_in_dirs[0]:
+                    stdout.write("Only in %s: %s" % (filepaths[1], filename))
----------------
zturner wrote:
> This looks a little strange to me.  `stdout.write` doesn't add a newline does it?  Wouldn't this mash everything together on a single line?  Any reason not to use `print` here?
Good catch, `print` sounds good. Thanks!


https://reviews.llvm.org/D41776





More information about the llvm-commits mailing list