[PATCH] D67643: [lit] Extend internal diff to support `-` argument

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 8 14:33:53 PDT 2019


rnk accepted this revision.
rnk added a comment.
This revision is now accepted and ready to land.

lgtm



================
Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:27-29
+    # FIXME: How can we restart stdin if the encoding is wrong?  How can we
+    # read stdin with a different encoding or in binary mode in a way that's
+    # compatible with python 2 and 3?
----------------
jdenny wrote:
> jdenny wrote:
> > rnk wrote:
> > > This function seems fragile to me. It reads each file three times:
> > > 1. without an encoding (system default?), may fail
> > > 2. with a utf-8 encoding, may fail
> > > 3. inside compareTwo*Files, reads them yet again
> > > 
> > > I think the best way to be portable between Python 2 & 3 will probably be to work in bytes as much as possible.
> > > 
> > > I think you can use this pattern to get a new file descriptor for stdin that reads bytes:
> > > ```
> > > fileno = sys.stdin.fileno()
> > > if fileno is not None:
> > >     new_stdin = os.fdopen(os.dup(fileno), 'rb') # read in binary
> > > ```
> > > 
> > > Here's how I would do it:
> > > 1. read both inputs completely in binary (can't fail)
> > > 2. try decoding the entire file in a few encodings (try default, try utf-8, if that fails, stick with bytes and diff_bytes)
> > > 3. split file on u'\n' or b'\n' as appropriate, perhaps stripping trailing '\r' if present, depending on flags
> > > 
> > > Does that seem reasonable?
> > I don't know the motivation by the original implementation.  Your plan sounds better.  Sorry, but I don't have time to implement and test this right now.  Hopefully soon.  Thanks for the review.
> > I think you can use this pattern to get a new file descriptor for stdin that reads bytes:
> > 
> > ```
> > fileno = sys.stdin.fileno()
> > if fileno is not None:
> > ```
> 
> I don't see documentation saying that `sys.stdin.fileno` might return `None`.  Is that possible?
I think I just copied it from stack overflow here: https://stackoverflow.com/questions/32199552/what-is-sys-stdin-fileno-in-python

I guess in this case os.dup would fail, which is a reasonable failure mode.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67643/new/

https://reviews.llvm.org/D67643





More information about the llvm-commits mailing list