[PATCH] D93031: Enable fexec-charset option

Abhina Sree via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 4 06:20:34 PST 2021


abhina.sreeskantharajan marked 4 inline comments as done.
abhina.sreeskantharajan added inline comments.


================
Comment at: clang/include/clang/Lex/Preprocessor.h:145
   ModuleLoader      &TheModuleLoader;
+  LiteralConverter  LT;
 
----------------
rsmith wrote:
> Please give this a longer name. Abbreviation names should only be used in fairly small scopes where it's easy to look up what they refer to.
> 
> Also: why `LT`? What does the `T` stand for?
Thanks for catching this. This was a change I missed when renaming LiteralTranslator to LiteralConverter. I've added a longer name.


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6178
+  CmdArgs.push_back(Args.MakeArgString(Triple.getSystemCharset()));
+  for (auto it = vList.begin(), ie = vList.end(); it != ie; ++it) {
+    llvm::ErrorOr<llvm::CharSetConverter> ErrorOrConverter =
----------------
rsmith wrote:
> Looping over all the arguments is a little unusual. Normally we'd get the last argument value and only check that one. Do you need to pass more than one value onto the frontend?
Thanks, I've changed it back to get the LastArg only and use the spelling of the argument to fix the diagnostic error message in the driver lit tests.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:1363-1364
+          if (!HadError && Converter) {
+            assert(Kind != tok::wide_char_constant &&
+                   "Wide character translation not supported");
+            SmallString<1> ConvertedChar;
----------------
rsmith wrote:
> Why is this case not possible?
This case should be handled when fwide-exec-charset option is implemented. Until then, we thought it was best to emit a error message that wide literal translation is not supported.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:234
+    SmallString<8> ResultCharConv;
+    Converter->convert(std::string(1, ByteChar), ResultCharConv);
+    memcpy((void *)&ResultChar, ResultCharConv.data(), sizeof(unsigned));
----------------
rsmith wrote:
> abhina.sreeskantharajan wrote:
> > tahonermann wrote:
> > > Conversion can fail here, particularly in the scenario corresponding to the default switch case above; `ResultChar` could contain, for example, a lead byte of a UTF-8 sequence.  Something sensible should be done here; either rejecting the code with an error or substituting `?` (in the execution encoding) seems appropriate to me.
> > Thanks, I added the substitution with the '?' character for invalid escapes.
> This is a regression. Our prior behavior for unknown escapes was to leave the character alone. We should still do that wherever possible -- eg, `\q` should produce `q` -- and take fallback action only if the character is unencodable. Producing a `?` seems unlikely to ever be what anyone wants; producing a hard error would seem preferable.
Hi @tahonermann, do you also agree we should use the original behaviour or give a hard error instead?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93031/new/

https://reviews.llvm.org/D93031



More information about the cfe-commits mailing list