[Lldb-commits] [PATCH] D16736: Always write the session log file in UTF-8

Todd Fiala via lldb-commits lldb-commits at lists.llvm.org
Sun Jan 31 12:04:50 PST 2016


tfiala added a comment.

> I'm going to have a look at trying this modification now.


I'm getting the same error with the replace.

Here is the patch (okay, the whole encoded_file.py) I was able to use to get past this. The underlying problem looks to be in the match result printing, which writes a raw byte buffer (never meant to be unicode printable) to the result stream. In other words, the error I was getting looked to be purely a by-product of printing match results that succeeded, not failed; the unicode decoding introduced the actual failure point:

  """
                       The LLVM Compiler Infrastructure
  
  This file is distributed under the University of Illinois Open Source
  License. See LICENSE.TXT for details.
  
  Prepares language bindings for LLDB build process.  Run with --help
  to see a description of the supported command line arguments.
  """
  
  # Python modules:
  import io
  
  # Third party modules
  import six
  
  def _encoded_read(old_read, encoding):
      def impl(size):
          result = old_read(size)
          # If this is Python 2 then we need to convert the resulting `unicode` back
          # into a `str` before returning
          if six.PY2:
              result = result.encode(encoding)
          return result
      return impl
  
  def _encoded_write(old_write, encoding):
      def impl(s):
          # If we were asked to write a `str` (in Py2) or a `bytes` (in Py3) decode it
          # as unicode before attempting to write.
          if isinstance(s, six.binary_type):
              try:
                  s = s.decode(encoding)
              except UnicodeDecodeError as decode_err:
                  import sys
                  # Report the raw bytes via repr so the message itself stays printable.
                  sys.stderr.write("error: unicode decode failed on raw string {!r}: '{}'\n".format(s, decode_err))
                  s = u"Could not decode unicode string, see stderr for details"
          return old_write(s)
      return impl
  
  '''
  Create a Text I/O file object that can be written to with either unicode strings or byte strings
  under Python 2 and Python 3, and automatically encodes and decodes as necessary to return the
  native string type for the current Python version
  '''
  def open(file, encoding, mode='r', buffering=-1, errors=None, newline=None, closefd=True):
      wrapped_file = io.open(file, mode=mode, buffering=buffering, encoding=encoding,
                             errors=errors, newline=newline, closefd=closefd)
      new_read = _encoded_read(getattr(wrapped_file, 'read'), encoding)
      new_write = _encoded_write(getattr(wrapped_file, 'write'), encoding)
      setattr(wrapped_file, 'read', new_read)
      setattr(wrapped_file, 'write', new_write)
      return wrapped_file

It just adds a try/except block around the unicode decode.  It is highly likely that this might not run on Python 3, i.e. it could ping-pong this back into a Python 3 error.  I may try to bring this up on Windows to see if that actually happens.
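For comparison, here is a minimal sketch of a different way to keep the decode from raising: pass an error handler to `decode` instead of catching the exception. This is not what the patch above does. The helper name `decode_for_log` is hypothetical, and whether replacement characters are acceptable in the session log is an assumption.

```python
def decode_for_log(data, encoding="utf-8"):
    """Hypothetical helper: decode bytes for logging without ever raising.

    Undecodable bytes become U+FFFD (the unicode replacement character)
    rather than triggering UnicodeDecodeError.
    """
    if isinstance(data, bytes):
        return data.decode(encoding, errors="replace")
    # Already text: pass through unchanged.
    return data


if __name__ == "__main__":
    # Valid UTF-8 survives as-is; the stray 0xff byte is replaced.
    print(repr(decode_for_log(b"ok\xff")))
```

The trade-off is that the original byte values are lost in the log, whereas the try/except version at least reports them on stderr.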

In any event, the right fix here is probably for the expect-string-match code to *not* try to print the matched/expected text when the data is known to be binary, since that data is just going to have no way of being valid text.  The other way to go (perhaps better) would be to put some kind of safe wrapper around the byte compares, so that they are tested as ASCII-ified output, or to use an entirely different mechanism here.
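As a rough illustration of the "ASCII-ified output" idea, the sketch below escapes every non-printable byte so that a binary buffer can be compared and printed as plain text. The name `asciify` and the `\xNN` escape format are assumptions for illustration only; nothing like this exists in the test suite as far as this message shows.

```python
def asciify(data):
    """Hypothetical helper: render a byte buffer as printable ASCII text.

    Printable ASCII bytes are kept; everything else (and the backslash
    itself, to keep the output unambiguous) is written as \\xNN.
    """
    parts = []
    # bytearray() yields integers on both Python 2 and Python 3.
    for b in bytearray(data):
        if 32 <= b < 127 and b != 0x5C:
            parts.append(chr(b))
        else:
            parts.append("\\x%02x" % b)
    return "".join(parts)


if __name__ == "__main__":
    # The result is always safe to write to a text log.
    print(asciify(b"ok\xff\x00"))
```

With a wrapper like this, both the expected and the actual buffers could be converted before the compare, so a later print of the match result never sees undecodable bytes.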

If this works on Windows the way I fixed this up, you can go ahead and check this in.  (I no longer get failures with the code change I made above).  If it doesn't work but you can tweak that slightly, feel free to go ahead and do that as well.  In the meantime I am going to see if I can get the binary aspect of the matching handled properly (i.e. not done as string compares).  This might be one of my tests.  (It's at least in the goop of lldb-server tests that I had written 1.5 to 2 years ago).


http://reviews.llvm.org/D16736




