[lldb-dev] [llvm-dev] Adding DWARF5 accelerator table support to llvm

Tue Jan 30 07:41:04 PST 2018

> On Jan 30, 2018, at 7:35 AM, Pavel Labath <labath at google.com> wrote:
> 
> Hello all,
> 
> I am looking for feedback regarding implementation of the case folding
> algorithm for .debug_names hashes.
> 
> Unlike the Apple tables, the .debug_names hashes are computed from
> case-folded names (to enable case-insensitive lookups for languages
> where that makes sense). The DWARF5 document specifies that the case
> folding should be done according to the "Caseless matching" section
> of the Unicode standard (whose implementation is basically a long list
> of special cases). While certainly possible, implementing this would
> be much more complicated (and would probably make the code a bit
> slower) than a simple tolower(3) call. And the benefits of this are
> not really clear to me.

Assuming a UTF-8 encoding, will tolower(3) destroy any non-ASCII characters in the process? In Swift, for example, we allow a wide range of Unicode characters in identifiers, and I want to make sure that this doesn't cause any problems.
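
To make the concern concrete, here is a minimal sketch (not code from LLVM; the helper name foldAsciiCase is hypothetical) of a fold that touches only the bytes 'A'..'Z'. Every byte of a multi-byte UTF-8 sequence has its high bit set, so a fold restricted to that range cannot alter non-ASCII characters. tolower(3) itself is locale-dependent, and passing it a negative char value is undefined behavior, so a locale-independent loop like this is probably the safer way to get the "basic Latin only" behavior:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, for illustration only: lower-cases the ASCII
// letters 'A'..'Z' and copies every other byte through unchanged.
// UTF-8 lead and continuation bytes are all >= 0x80, so they never
// fall into the 'A'..'Z' range and are left intact.
std::string foldAsciiCase(const std::string &Name) {
  std::string Result = Name;
  for (char &C : Result)
    if (C >= 'A' && C <= 'Z')
      C = static_cast<char>(C - 'A' + 'a');
  return Result;
}
```

With this helper, "Foo" folds to "foo", while the UTF-8 bytes of a name such as "Été" come back unchanged, including the upper-case "É": the name is not corrupted, but it is also not case-folded the way the Unicode algorithm would fold it.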

-- adrian
> 
> Do you know if we already make any promises or assumptions about the
> encoding and/or locale of the symbol names (and here I mainly mean the
> names in the debug info metadata, not LLVM symbols)?
> 
> If we don't already have a policy about this, then I propose to
> implement the case folding via tolower() (which is compatible with the
> full case folding algorithm, as long as one sticks to basic Latin
> characters).
> 
> What do you think?
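
As a rough sketch of what the tolower()-style proposal above amounts to, the function below folds ASCII letters while hashing. It assumes the .debug_names hash is the DJB hash (initial value 5381, multiply by 33, add the next byte), which is my understanding of the DWARF5 document, and the function name caseFoldingDjbHashAscii is hypothetical rather than an existing LLVM API:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative only: DJB hash over the ASCII-case-folded bytes of Name.
// Non-ASCII bytes are hashed as-is, so this matches full Unicode case
// folding only for names made of basic Latin characters.
uint32_t caseFoldingDjbHashAscii(const std::string &Name) {
  uint32_t H = 5381; // DJB initial value (assumed from the DWARF5 text)
  for (unsigned char C : Name) {
    if (C >= 'A' && C <= 'Z')
      C = static_cast<unsigned char>(C - 'A' + 'a');
    H = H * 33 + C; // unsigned arithmetic wraps modulo 2^32
  }
  return H;
}
```

Under this scheme "Foo" and "foo" land in the same bucket, whereas names that differ only in the case of a non-ASCII letter would hash differently. That is exactly the divergence from the Unicode "Caseless matching" rules that the question above is about.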