[PATCH] D43165: [lit] Fix problem in how Python versions open files with different encodings

Dan Liew via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 21 01:09:16 PST 2018


delcypher added a comment.

In https://reviews.llvm.org/D43165#1012857, @MatzeB wrote:

> Reading the Python documentation and playing a bit with a prototype, I believe this is the way to do things:
>
>   import difflib
>   import sys
>
>   # Read both files as bytes so that no encoding is assumed.
>   with open(sys.argv[1], 'rb') as f:
>       lines1 = f.readlines()
>   with open(sys.argv[2], 'rb') as f:
>       lines2 = f.readlines()
>   if hasattr(difflib, 'diff_bytes'):
>       # python 3.5 or newer
>       gen = difflib.diff_bytes(difflib.unified_diff, lines1, lines2)
>       sys.stdout.buffer.writelines(gen)
>   else:
>       # python 2.7
>       gen = difflib.unified_diff(lines1, lines2)
>       sys.stdout.writelines(gen)
>
>
> (In Python 3.0-3.4, before diff_bytes was added, difflib does indeed appear to be broken for inputs with inconsistent or invalid encodings, but I hope we can just ignore old 3.x versions...)


Given that you're aware of this issue, I don't think we should ignore it.  I don't want weird diff failures because a machine I happen to be using has an old Python 3 version. This would be a serious PITA to debug.

Requiring Python >= 3.5 when running under Python 3 seems reasonable, given that this only matters for the internal shell, which is used on Windows (where no Python is shipped by default, so upgrading Python should be fairly painless).

If we're doing a binary diff on Python 3.0-3.4 I think we should just emit an error explaining the problem and fail the currently running test.
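
A minimal sketch of what that check could look like, building on the prototype quoted above. The helper name `binary_unified_diff` and the use of `RuntimeError` are assumptions made for illustration only; how lit would actually report the failure for the running test is not settled here.

```python
import difflib
import sys


def binary_unified_diff(lines_a, lines_b):
    """Return the unified diff of two lists of byte lines as a list of bytes.

    Hypothetical helper for illustration; not part of lit.
    """
    if hasattr(difflib, 'diff_bytes'):
        # Python 3.5 or newer: diff_bytes handles arbitrary byte content.
        return list(difflib.diff_bytes(difflib.unified_diff, lines_a, lines_b))
    if sys.version_info[0] >= 3:
        # Python 3.0-3.4: no diff_bytes, so refuse with a clear message
        # instead of producing a confusing failure later on.
        raise RuntimeError(
            'binary diff requires Python >= 3.5 when running under Python 3 '
            '(difflib.diff_bytes is not available)')
    # Python 2.7: str is already a byte string, so unified_diff works as is.
    return list(difflib.unified_diff(lines_a, lines_b))
```

The caller would catch the error and fail the current test with that message, so the cause is visible right away instead of showing up as an unexplained diff failure.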


Repository:
  rL LLVM

https://reviews.llvm.org/D43165




