[lldb-dev] Editline Rewrite : issues surrounding wide character handling on different platforms

Shawn Best sbest at blueshiftinc.com
Tue Oct 28 12:42:56 PDT 2014


There is a significant Editline rewrite that adds a number of 
improvements.  It has been well tested on OS X, but not yet upstreamed.  
I have spent some time reviewing the proposed patch and working through 
issues to get it running on Linux.  To see the patch and accompanying 
discussion, refer to: http://reviews.llvm.org/D5835    The main issues 
that came up are related to handling wide characters and differences 
between platforms.

Internally, lldb uses std::string, which is a sequence of 8-bit chars 
that can hold either 7-bit ASCII or UTF-8 encoded text, where a single 
character may span several bytes. Libedit uses either char or wchar_t, 
which is a 32-bit type on Linux.

<codecvt> : the patch uses the C++11 class std::codecvt_utf8, a facet 
implementation that does UTF-8 to wchar_t conversion.  It is part of the 
C++11 standard, but not yet supported in gcc.  I can use #ifdef to 
temporarily provide equivalent functionality in that case while we wait 
for gcc to catch up.
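
As a rough sketch of the #ifdef idea above: the macro name 
EDITLINE_SKETCH_HAVE_CODECVT and the function Utf8ToWide are made up for 
illustration and are not taken from the patch. When the macro is defined, 
the conversion goes through std::codecvt_utf8; otherwise a small 
hand-written decoder stands in. The fallback assumes wchar_t is wide 
enough to hold any code point (true on Linux, not on Windows) and does 
not reject overlong encodings.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical macro: the build system would define it only when the
// standard library ships <codecvt>.
#ifdef EDITLINE_SKETCH_HAVE_CODECVT
#include <codecvt>
#include <locale>

// Convert UTF-8 bytes to a wide string using the C++11 facet.
std::wstring Utf8ToWide(const std::string &utf8) {
  std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
  return conv.from_bytes(utf8); // throws std::range_error on bad input
}
#else
// Fallback decoder for toolchains without <codecvt>.
// Assumes wchar_t can hold a full code point; overlong forms are not rejected.
std::wstring Utf8ToWide(const std::string &utf8) {
  std::wstring out;
  for (std::size_t i = 0; i < utf8.size();) {
    const unsigned char lead = static_cast<unsigned char>(utf8[i]);
    std::size_t len = 0;
    unsigned long cp = 0;
    // The lead byte determines the sequence length and the first payload bits.
    if (lead < 0x80) {
      len = 1; cp = lead;
    } else if ((lead & 0xE0) == 0xC0) {
      len = 2; cp = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      len = 3; cp = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      len = 4; cp = lead & 0x07;
    } else {
      throw std::range_error("invalid UTF-8 lead byte");
    }
    if (i + len > utf8.size())
      throw std::range_error("truncated UTF-8 sequence");
    // Each continuation byte contributes six more bits.
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
      if ((c & 0xC0) != 0x80)
        throw std::range_error("invalid UTF-8 continuation byte");
      cp = (cp << 6) | (c & 0x3F);
    }
    out.push_back(static_cast<wchar_t>(cp));
    i += len;
  }
  return out;
}
#endif
```

With either branch, Utf8ToWide("\xC3\xA9") yields a one-element wide 
string holding U+00E9.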

libedit : libedit is a prerequisite that a new Linux lldb user installs 
( sudo apt-get install libedit-dev ).  A few years ago, libedit added 
versions of its functions that work on wchar_t. Unfortunately, this 
option is not enabled by default, and it is not present in the Ubuntu 
package.  To get around this, I see a few options:

- take the libedit source files (or a subset) and add them to the lldb 
project.  We could either build a .so file, or just compile the sources 
and link them statically.

- rework the Editline rewrite so that it uses either standard 8-bit 
chars or wchar_t/UTF-8, selected at build time depending on the 
platform.

- modify the Ubuntu package, so 'sudo apt-get install libedit-dev' 
installs a version that has been built with wide character support 
enabled.

- introduce a custom step for new Linux lldb users, where they download 
the libedit source, then build and install a wchar_t version.

The last 2 options don't seem that great.
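
To make the second option (conditional char vs. wchar_t) a bit more 
concrete, here is a minimal sketch. All names in it 
(SKETCH_EDITLINE_USE_WCHAR, SketchEditChar, SketchEditString, 
SKETCH_EDIT_STR, PadPrompt) are hypothetical and not from the patch; the 
assumption is that the build system defines the macro only where the 
installed libedit exports its wchar_t entry points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical switch, defined by the build system only on platforms whose
// libedit was built with wide character support (el_wgets and friends).
#ifdef SKETCH_EDITLINE_USE_WCHAR
typedef wchar_t SketchEditChar;
#define SKETCH_EDIT_STR(s) L##s
#else
typedef char SketchEditChar;
#define SKETCH_EDIT_STR(s) s
#endif

// One string type for the rest of the code, whichever character was chosen.
typedef std::basic_string<SketchEditChar> SketchEditString;

// Example of code written once against the selected character type:
// pad a prompt with spaces up to a minimum width.
SketchEditString PadPrompt(const SketchEditString &prompt, std::size_t width) {
  SketchEditString padded(prompt);
  if (padded.size() < width)
    padded.append(width - padded.size(), SKETCH_EDIT_STR(' '));
  return padded;
}
```

Only the typedef and the literal macro change between platforms; code 
such as PadPrompt(SKETCH_EDIT_STR("(lldb)"), 8) stays the same in both 
builds. The cost is that every literal passed to libedit has to go 
through the macro.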

I expect there will be problems on Windows, which I think uses UTF-16 
encoding.  The file EditLineWin.cpp contains prototypes for most of the 
structures and functions needed, but they look stubbed out.
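
To illustrate why UTF-16 would need different handling from a 32-bit 
wchar_t: code points above U+FFFF do not fit in one 16-bit unit and have 
to be split into a surrogate pair. The helper below is only a sketch 
(AppendUtf16 is a made-up name, and nothing like it is claimed to exist 
in EditLineWin.cpp); it assumes the input is a valid code point and does 
no range checking.

```cpp
#include <string>

// Append one Unicode code point to a UTF-16 string.
// Assumes cp is a valid scalar value (not a surrogate, at most 0x10FFFF).
void AppendUtf16(std::u16string &out, char32_t cp) {
  if (cp < 0x10000) {
    // Basic Multilingual Plane: a single 16-bit unit.
    out.push_back(static_cast<char16_t>(cp));
  } else {
    // Supplementary planes: split the 20 remaining bits across two units.
    cp -= 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high surrogate
    out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
  }
}
```

So one character in the Linux wchar_t build can become two units on 
Windows, which matters anywhere the code counts characters or cursor 
columns.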

Any thoughts?

Shawn.


