[PATCH] D153914: [clang-cl] Enable concatenation of predefined identifiers

Richard Dzenis via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jul 21 16:14:04 PDT 2023


RIscRIpt added a comment.

Thanks to your suggestion of testing different types, I realized clang does not support MSVC macros with `u`, `u8`, and `U` prefixes in addition to `L__FUNCDNAME`. By the way, clang replicates MSVC behavior a little bit incorrectly: `L__FUNCTION__` is not a valid token for MSVC, however it becomes valid if used via macro concatenation (see `#define WIDE`).
I think addition of support of new macros as well as bug fix (if possible) should be submitted separately. I'll open-up an issue, and submit a PR later.



================
Comment at: clang/lib/Sema/SemaExpr.cpp:2010-2013
+    OS << "R\"EFLPM("
+       << PredefinedExpr::ComputeName(getPredefinedExprKind(Tok.getKind()),
+                                      currentDecl)
+       << ")EFLPM\"";
----------------
tahonermann wrote:
> "EFLPM"? I'm assuming this is an attempt to avoid an accidental clash with the computed name. Does this suffice to ensure a clash can't happen? If not, can we do better? Per http://eel.is/c++draft/lex.string#nt:r-char and http://eel.is/c++draft/lex.charset#1, we have a lot of flexibility regarding which characters to use.
At first I thought you were hinting me to use non-ascii characters for d-char-sequence. However, when I looked closely d-char-sequence is not as flexible as r-chars that you referenced: http://eel.is/c++draft/lex.string#nt:d-char and http://eel.is/c++draft/lex.charset#2

Here're my thoughts:
1. I was afraid that there could be a possibility of a function name contain either quote (`"`) or a backslash (`\`) despite it seemed impossible.
2. At first I used `Lexer::Stringify` to make it bullet-proof. I didn't like it, neither did you (reviewers).
3. I replaced `Lexer::Stringify` with raw-string-literal (d-char-sequence was chosen as acronym of current function name).
4. Current `EFLPM` looks weird and cryptic. And we are limited to [lex.charset.basic] with exceptions (space and earlier chars are not allowed). I think, using a raw-string-literal without d-char-sequence would be enough, because problem could be caused only by two consecutive characters `)"`, neither of which can appear in a function name.


================
Comment at: clang/test/Sema/ms_predefined_expr.cpp:115
+                                                                                                       // expected-warning{{string literal concatenation of predefined identifier 'L__FUNCSIG__' is a Microsoft extension}}
+}
+
----------------
tahonermann wrote:
> How about testing some other encoding prefix combinations?
>   u"" __FUNCTION // Ok; UTF-16 string literal
>   u"" L__FUNCTION__ // ill-formed.
> 
Good idea. I am not sure whether I should specify `-std` in the test command. I'll leave it without, because current default standard is C++17, and I believe it's not going to be decreased.

And I think there are enough tests of checking values of these macros. So, in tests for encoding I am going to check types.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153914/new/

https://reviews.llvm.org/D153914



More information about the cfe-commits mailing list