[PATCH] D46238: [llvm-rc] Add rudimentary support for codepages

Adrian McCarthy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 30 10:10:16 PDT 2018


amccarth added inline comments.


================
Comment at: tools/llvm-rc/ResourceFileWriter.cpp:144
+  } else if (CodePage == CP_WIN_1252) {
+    // In CP 1252, all codepoints map straight over to unicode.
+    for (char C : Str)
----------------
s/all/most/


================
Comment at: tools/llvm-rc/ResourceFileWriter.cpp:195
+      } else if (CodePage == CP_WIN_1252) {
+        // In CP 1252, all codepoints map straight over to unicode.
       } else {
----------------
s/all/most/


================
Comment at: tools/llvm-rc/ResourceFileWriter.h:29
+enum CodePage {
+  CP_ACP = 0,         // The current used codepage. Since there's no such
+                      // notion in LLVM what codepage it actually means,
----------------
Are you concerned about the name clash with the macro defined in the Windows headers?  It seems like code that wants to include this header better make sure that it hasn't already included Windows headers (or that it has undefined the macros).


================
Comment at: tools/llvm-rc/ResourceFileWriter.h:33
+  CP_WIN_1252 = 1252, // A codepage where all 8 bit values correspond to
+                      // unicode code points with the same value.
+  CP_UTF8 = 65001,    // UTF-8.
----------------
That's not _strictly_ true.  Windows-1252 maps many characters in the range 0x80-0x9F that do not correspond to the "upper" control characters Unicode has at the corresponding code points.  For example, Windows-1252 0x83 corresponds to Unicode U+0192 (LATIN SMALL LETTER F WITH HOOK).

It's probably true enough for practical purposes, but I'd be disappointed if this comment leads readers to not appreciate the differences.

Also, s/8 bit values/8-bit values/.




Repository:
  rL LLVM

https://reviews.llvm.org/D46238





More information about the llvm-commits mailing list