[lldb-dev] [Bug 33250] New: Issue with utf8 string display in the ncurses-GUI mode on a Beaglebone Black (ARMv6)

via lldb-dev lldb-dev at lists.llvm.org
Wed May 31 16:04:10 PDT 2017


https://bugs.llvm.org/show_bug.cgi?id=33250

            Bug ID: 33250
           Summary: Issue with utf8 string display in the ncurses-GUI mode
                    on a Beaglebone Black (ARMv6)
           Product: lldb
           Version: unspecified
          Hardware: Other
                OS: FreeBSD
            Status: NEW
          Severity: normal
          Priority: P
         Component: All Bugs
          Assignee: lldb-dev at lists.llvm.org
          Reporter: rj at obsigna.com
                CC: llvm-bugs at lists.llvm.org

I built (lldb + clang/lld) from the svn trunk of LLVM 5.0.0 on my Beaglebone
Black running the latest snapshot (May 26th) of FreeBSD 12.0-CURRENT.

# lldb --version
lldb version 5.0.0 (http://llvm.org/svn/llvm-project/lldb/trunk revision
304078)
  clang revision 304078
  llvm revision 304078

However, the issue has been present in lldb since at least version 3.8.


First of all, my system's locale is:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=

$ echo $TERM
xterm-256color
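
For anyone reproducing this, the following is a minimal sketch of how a program can confirm that the locale it adopted at startup really uses UTF-8. The helper names is_utf8_codeset() and process_locale_is_utf8() are made up for illustration; setlocale() and nl_langinfo(CODESET) are the standard POSIX calls. The accepted spellings of the codeset name are an assumption on my part, since platforms differ in how they report it.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: returns 1 if the codeset name denotes UTF-8.
   Accepting both "UTF-8" and "UTF8" (case-insensitively) is an assumption. */
int is_utf8_codeset(const char *codeset)
{
  return codeset != NULL &&
         (strcasecmp(codeset, "UTF-8") == 0 ||
          strcasecmp(codeset, "UTF8") == 0);
}

/* Hypothetical helper: adopt the locale from the environment (LANG, LC_*)
   and report whether its character encoding is UTF-8. */
int process_locale_is_utf8(void)
{
  if (setlocale(LC_CTYPE, "") == NULL)
    return 0;                      /* the locale could not be set */
  return is_utf8_codeset(nl_langinfo(CODESET));
}
```

With the settings shown above, process_locale_is_utf8() would be expected to return 1, although I have not run this sketch on the Beaglebone.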


Now, please consider the following small ncurses test program, 'cursutf8.c',
which prints the traditional German pangram used to test the special
characters 'ä', 'ö', 'ü' and 'ß', each of which takes two bytes when encoded
in UTF-8:

#include <stdio.h>
#include <locale.h>
#include <curses.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  setlocale(LC_CTYPE, "");

  WINDOW *window = initscr();
  if (window)
  {
     mvaddstr(3, 3, "Zwölf Boxkämpfer jagen Viktor quer über den großen Sylter Deich.");
     refresh();
     sleep(3);

     delwin(window);
     endwin();
     refresh();

     return 0;
  }

  else
     return 1;
}
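
To make the two-byte claim concrete, here is a small sketch that derives the length of a UTF-8 sequence from its lead byte and counts characters as opposed to bytes. The function names utf8_seq_len() and utf8_count_chars() are invented for this illustration and are not part of any library. The sketch does not validate its input (overlong or truncated sequences are not rejected).

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 sequence introduced by `lead`, or 0 if
   `lead` is a continuation byte (10xxxxxx) or cannot start a sequence.
   No check for overlong encodings is made. */
size_t utf8_seq_len(unsigned char lead)
{
  if (lead < 0x80)           return 1;  /* 0xxxxxxx: plain ASCII  */
  if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx: e.g. ä ö ü ß */
  if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx               */
  if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx               */
  return 0;
}

/* Number of characters in a NUL-terminated UTF-8 string, obtained by
   counting every byte that is not a continuation byte. */
size_t utf8_count_chars(const char *s)
{
  size_t n = 0;
  for (; *s != '\0'; ++s)
    if (((unsigned char)*s & 0xC0) != 0x80)
      ++n;
  return n;
}
```

For example, 'ö' is encoded as the bytes 0xC3 0xB6, so utf8_seq_len(0xC3) is 2, and "Zwölf" is five characters long but occupies six bytes as far as strlen() is concerned.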


I compile this using:
  $ clang -g -O0 cursutf8.c -lncursesw -o cursutf8


When I run it, it correctly prints out:
  Zwölf Boxkämpfer jagen Viktor quer über den großen Sylter Deich.


Then I start it under lldb:

$ lldb -- cursutf8

(lldb) breakpoint set -f cursutf8.c -l 13
(lldb) run

Process 13886 stopped
* thread #1, name = 'cursutf8', stop reason = breakpoint 1.1
   frame #0: 0x00008a08 cursutf8`main(argc=1, argv=0xbfbfec74) at cursutf8.c:13
  10       WINDOW *window = initscr();
  11       if (window)
  12       {
-> 13         mvaddstr(3, 3, "Zwölf Boxkämpfer jagen Viktor quer über den großen Sylter Deich.");
  14          refresh();
  15          sleep(3);
  16    

So far this is OK as well.


The issue shows up when I enter GUI mode:

(lldb) gui

│ 10 │    WINDOW *window = initscr();                                           
│ 11 │    if (window)                                                           
│ 12 │    {                                                                     
│ 13 │◆      mvaddstr(3, 3, "ZwM-CM-6lf BoxkM-CM-$mpfer jagen Viktor quer M-CM-<ber den groM-C~_en Sylter Deich.");
│ 14 │       refresh();                                                         
│ 15 │       sleep(3);                                                          
│ 16 │
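
The garbled text in line 13 looks like each UTF-8 byte being rendered on its own instead of as part of a two-byte character. The sketch below reproduces the pattern: bytes from 0xA0 upward become "M-" plus the byte minus 0x80, and bytes from 0x80 to 0x9F become "~" plus the byte minus 0x40. I assume this is the notation that ncurses' unctrl() uses for non-printable bytes, but I have not verified that lldb goes through that code path. The names render_byte() and render_bytes() are made up for the illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the rendering of one byte to `out` (at least 4 bytes) and return
   its length. The two ranges >= 0x80 are taken from the GUI output above;
   the caret forms for control characters are the usual convention.
   0xFF is not handled specially; it cannot occur in valid UTF-8. */
size_t render_byte(unsigned char c, char *out)
{
  if (c < 0x20)  return (size_t)sprintf(out, "^%c", c + 0x40);
  if (c < 0x7F)  return (size_t)sprintf(out, "%c", c);
  if (c == 0x7F) return (size_t)sprintf(out, "^?");
  if (c < 0xA0)  return (size_t)sprintf(out, "~%c", c - 0x40);
  return (size_t)sprintf(out, "M-%c", c - 0x80);
}

/* Render a NUL-terminated string byte by byte into `out`, which holds
   `outsz` bytes (outsz must be at least 1). Output is truncated if needed. */
void render_bytes(const char *s, char *out, size_t outsz)
{
  size_t used = 0;
  char tmp[4];

  out[0] = '\0';
  for (; *s != '\0'; ++s) {
    size_t n = render_byte((unsigned char)*s, tmp);
    if (used + n + 1 > outsz)
      break;
    memcpy(out + used, tmp, n + 1);  /* copies the terminating NUL too */
    used += n;
  }
}
```

Fed the UTF-8 bytes of "Zwölf" (0xC3 0xB6 for 'ö'), render_bytes() yields "ZwM-CM-6lf", and 'ß' (0xC3 0x9F) comes out as "M-C~_", which matches what the GUI shows. This suggests that the GUI treats the string as single-byte characters on this platform, although it does not say why.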


CONCLUSION:

Within LLDB on x86, UTF-8 text is displayed correctly in both CLI and GUI mode.

Within LLDB on ARMv6, UTF-8 text is displayed correctly in CLI mode but NOT in
curses GUI mode.

A plain ncurses program, by contrast, handles UTF-8 text correctly on both ARM
and x86.

Best regards

Rolf
