[PATCH] D46238: [llvm-rc] Add rudimentary support for codepages

Adrian McCarthy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 30 11:53:19 PDT 2018


amccarth added inline comments.


================
Comment at: tools/llvm-rc/ResourceFileWriter.h:33
+  CP_WIN_1252 = 1252, // A codepage where all 8 bit values correspond to
+                      // unicode code points with the same value.
+  CP_UTF8 = 65001,    // UTF-8.
----------------
mstorsjo wrote:
> amccarth wrote:
> > That's not _strictly_ true.  Windows-1252 maps many characters in the range 0x80-0x9F that do not correspond to the "upper" control characters Unicode has at the corresponding code points.  For example, Windows-1252 0x83 corresponds to Unicode U+0192 (LATIN SMALL LETTER F WITH HOOK).
> > 
> > It's probably true enough for practical purposes, but I'd be disappointed if this comment leads readers to not appreciate the differences.
> > 
> > Also, s/8 bit values/8-bit values/.
> > 
> > 
> Oh, thanks for pointing this out!
> 
> Is there any other codepage I should rather pick for the same purpose - trivial implementation in converting to unicode without external dependencies? Win-28591 aka latin1?
Yes, the Latin-1 codepage (28591, i.e. ISO-8859-1) would work as you intended and require no conversion, since every byte value is also the Unicode code point.  That said, it's probably pretty uncommon in practice.
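
To make the difference concrete, here is a rough sketch of the two conversions. The helper names (latin1ToUTF16, cp1252ToUTF16) are made up for illustration and are not part of this patch. The sketch also assumes that the five bytes Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) are simply passed through unchanged; a real implementation might prefer to reject them.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, for illustration only; not part of the patch.

// ISO-8859-1 (codepage 28591): each byte is its own code point, so the
// conversion is a plain widening of every byte.
std::u16string latin1ToUTF16(const std::string &In) {
  std::u16string Out;
  Out.reserve(In.size());
  for (unsigned char C : In)
    Out.push_back(static_cast<char16_t>(C));
  return Out;
}

// Windows-1252: same as Latin-1 except for 0x80-0x9F, which need a table.
std::u16string cp1252ToUTF16(const std::string &In) {
  // Code points for bytes 0x80..0x9F. Unassigned bytes are passed through
  // unchanged (an assumption of this sketch).
  static const char16_t High[32] = {
      0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
      0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
      0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
      0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178};
  std::u16string Out;
  Out.reserve(In.size());
  for (unsigned char C : In) {
    if (C >= 0x80 && C <= 0x9F)
      Out.push_back(High[C - 0x80]);
    else
      Out.push_back(static_cast<char16_t>(C));
  }
  return Out;
}
```

With these, the byte 0x83 becomes U+0083 under Latin-1 but U+0192 under Windows-1252, while bytes outside 0x80-0x9F (0xE9, say) come out the same either way.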


Repository:
  rL LLVM

https://reviews.llvm.org/D46238




