[PATCH] D90116: [clangd] Escape Unicode characters to fix Windows builds

Daniel Martín via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Oct 25 13:51:46 PDT 2020


danielmartin added a comment.

Using explicit UTF-8 string literals is a possible solution, but it makes the code a bit less readable. Another possible solution is to save the source file using UTF-8 with BOM, but this is confusing outside the Microsoft world (and it's very easy to remove the BOM by mistake).
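
As a rough illustration of the escaping approach named in the patch title, the sketch below spells U+2192 (RIGHTWARDS ARROW) as explicit UTF-8 bytes, so the source file stays pure ASCII and the result does not depend on how the compiler decodes the file. The function names `arrowUtf8` and `labelWithArrow` are hypothetical and are not taken from the patch; the choice of character is also just an assumption for the example.

```cpp
#include <string>

// Hypothetical helper (not from the patch): U+2192 written as its three
// UTF-8 bytes via hex escapes, so no non-ASCII byte appears in the source.
inline std::string arrowUtf8() {
  return "\xE2\x86\x92";
}

// Hypothetical helper: a hex escape consumes every following hex digit,
// so adjacent literals are split to keep the escape from absorbing text
// that happens to start with 0-9 or a-f.
inline std::string labelWithArrow() {
  return "a " "\xE2\x86\x92" " b";
}
```

This also shows the readability cost mentioned above: the escaped form no longer shows which character is meant, so a comment naming the code point is usually needed.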

I think passing `/utf-8` to MSVC is the best solution for good interoperability with Clang.

> I am not sure if there's a way to change that for clang/gcc. I believe they both require plain ASCII or UTF-8 anyway.

Clang enforces UTF-8 everywhere, so there's no need for additional configuration. Clang can also accept source files encoded in UTF-8 with BOM. I'm not sure about GCC; I think you need to set the encoding manually, as with MSVC (see the `-finput-charset` and `-fexec-charset` options).

I recommend reading this article about how MSVC interprets the encoding of source files, since the rules for the different cases are a bit complex: https://devblogs.microsoft.com/cppblog/new-options-for-managing-character-sets-in-the-microsoft-cc-compiler/


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90116/new/

https://reviews.llvm.org/D90116


