[cfe-dev] UTF-8 vs. UTF-16 code locations

Joachim Durchholz via cfe-dev cfe-dev at lists.llvm.org
Mon Jan 25 11:39:44 PST 2016


On 25.01.2016 at 19:23, Milian Wolff via cfe-dev wrote:
> This is done without ever loading any file in an editor. But we do run a lot
> of clang_parseTranslationUnit2 calls which will internally open files from
> disk. Then we visit the AST and get e.g. the position for a class declaration.
> In order to convert that position, assuming the file is UTF-8 encoded, I want
> to translate it to a UTF-16 position.

Can't you convert to UTF-16 during load? Then you don't need to 
translate at all.
I'm under the impression that you are keeping a UTF-8 data blob in an 
environment that mostly talks UTF-16; in that case, the cleanest 
solution would be to have the data blob in UTF-16, too. Of course, I 
don't know how much of your code base you'd have to touch to change 
that; it could be quite nasty or surprisingly easy.
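
If changing the blob turns out to be too invasive, the translation itself 
is small. Below is a minimal sketch in C++, not anything clang provides: 
utf16_offset is a hypothetical helper name, and it assumes the line is 
valid UTF-8 and that the offset you pass in is a 0-based byte offset that 
falls on a character boundary. It counts one UTF-16 code unit per 
character, and two for characters outside the BMP (4-byte UTF-8 
sequences, which become surrogate pairs).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns how many UTF-16 code units correspond to the
// first `byte_offset` bytes of `utf8`.
// Assumptions: `utf8` is valid UTF-8 and `byte_offset` lies on a character
// boundary. Offsets past the end are clamped to the end of the string.
std::size_t utf16_offset(std::string_view utf8, std::size_t byte_offset) {
    if (byte_offset > utf8.size()) {
        byte_offset = utf8.size();
    }
    std::size_t units = 0;
    for (std::size_t i = 0; i < byte_offset; ++i) {
        const unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) == 0x80) {
            continue;  // continuation byte (10xxxxxx): already counted via its lead byte
        }
        // Lead bytes >= 0xF0 start a 4-byte sequence, i.e. a code point above
        // U+FFFF, which needs a surrogate pair in UTF-16.
        units += (b >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

If the column you get back is 1-based and counted in bytes (I believe that 
is what libclang reports, but please verify), you would pass column - 1 
together with the text of that line and add 1 to the result again.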


