[llvm-bugs] [Bug 40765] New: non-ascii source files cannot be shown in scan-view

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Feb 18 11:51:41 PST 2019


https://bugs.llvm.org/show_bug.cgi?id=40765

            Bug ID: 40765
           Summary: non-ascii source files cannot be shown in scan-view
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Static Analyzer
          Assignee: dcoughlin at apple.com
          Reporter: johannes at sipsolutions.net
                CC: dcoughlin at apple.com, llvm-bugs at lists.llvm.org

When scan-view is asked to show a non-ASCII source file, we just get:

INTERNAL ERROR

Traceback (most recent call last):
  File "/usr/share/clang/scan-view-9/share/ScanView.py", line 232, in do_GET
    SimpleHTTPRequestHandler.do_GET(self)
  File "/usr/lib/python2.7/SimpleHTTPServer.py", line 45, in do_GET
    f = self.send_head()
  File "/usr/share/clang/scan-view-9/share/ScanView.py", line 712, in send_head
    return self.send_path(path)
  File "/usr/share/clang/scan-view-9/share/ScanView.py", line 727, in send_path
    return self.send_patched_file(path, ctype)
  File "/usr/share/clang/scan-view-9/share/ScanView.py", line 774, in send_patched_file
    return self.send_string(data, ctype, mtime=fs.st_mtime)
  File "/usr/share/clang/scan-view-9/share/ScanView.py", line 747, in send_string
    encoded_s = s.encode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 111162: ordinal not in range(128)


In ScanView.py line 747 we have:

 encoded_s = s.encode()

changing that to just

 encoded_s = s

appears to work around the problem.
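
For context, a likely explanation of why an encode() call raises a *decode*
error: under Python 2, calling .encode() on a byte string first decodes it
implicitly with the default 'ascii' codec, and that step fails on byte 0x92.
A slightly more defensive variant of the workaround might look like the sketch
below. The helper name to_bytes and the UTF-8 default are assumptions made for
illustration, not something taken from ScanView.py.

```python
def to_bytes(s, encoding="utf-8"):
    """Return s as bytes without re-encoding data that is already bytes.

    Hypothetical helper; the name and the UTF-8 default are assumptions.
    """
    if isinstance(s, bytes):
        # Already raw bytes (for example file contents read in binary mode):
        # pass them through unchanged so no implicit ASCII decode can happen.
        return s
    # Text: encode explicitly instead of relying on the default codec.
    return s.encode(encoding)
```

With something like this, line 747 could read encoded_s = to_bytes(s), which
keeps the pass-through behaviour for file data while still encoding any text
strings generated elsewhere.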

It's not clear what _should_ be done about this, though. C source files can
be in any encoding, particularly in comments, and we can't really know which
one a given file uses. Most of our files are UTF-8, but some older ones are in
ISO-8859-1 or similar encodings, depending on what the author used. Ideally, I
guess the bytes would just be passed through more or less unchanged; in the
worst case some text shows up as garbage in the browser, which is still better
than crashing.
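
If the contents ever do need to be handled as text rather than passed through,
one possible approach is to try UTF-8 first and fall back to ISO-8859-1, which
maps every byte value to a code point and therefore cannot fail. This is only
a sketch: the function name decode_source is hypothetical, and the choice of
these two encodings is an assumption based on the files described above.

```python
def decode_source(data):
    """Decode raw source bytes to text without ever raising.

    Hypothetical helper: tries UTF-8, then falls back to ISO-8859-1.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 assigns a code point to every byte value, so this
        # cannot fail; a wrong guess shows up as odd characters at worst.
        return data.decode("iso-8859-1")
```

A wrong guess then produces garbage characters in the browser instead of an
internal error, which matches the "better than crashing" goal.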
