[PATCH] D127363: [Lex] Fix for char32_t literal truncation on 16 bit architectures

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 20 06:20:45 PDT 2022


sammccall added inline comments.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:1600
 
-  llvm::APInt LitVal(PP.getTargetInfo().getIntWidth(), 0);
+  llvm::APInt LitVal(PP.getTargetInfo().getChar32Width(), 0);
 
----------------
tahonermann wrote:
> I don't think this is quite right. For the code that follows this change to work as intended and issue the "Character constant too long for its type" diagnostic, the width needs to match that of `int`. This is required for multicharacter literals (they have type `int`) so that an appropriate diagnostic is issued for `'xxxxx'` for targets that have a 32-bit int (or for `'xxx'` for targets that have a 16-bit int)`.
> 
> Additionally, the type of a character constant in C is `int`.
> 
> I think what is needed is something like:
>   unsigned BitWidth = getCharWidth(Kind, PP.getTargetInfo());
>   if (IsMultiChar || !PP.getLangOpts().CPlusPlus)
>     BitWidth = PP.getTargetInfo().getIntWidth();
>   llvm::APInt LitVal(BitWidth, 0);
Thanks for pointing this out.

My reading of https://eel.is/c++draft/lex.ccon#2 is that a multi-char char literal with a L/u8/u/U prefix is not `int` but the respective character types, so the conditions here are even a little *more* complicated than you suggest :-(


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127363/new/

https://reviews.llvm.org/D127363



More information about the cfe-commits mailing list