[PATCH] D153914: [clang-cl] Enable concatenation of predefined identifiers

Richard Dzenis via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jul 14 17:09:23 PDT 2023


RIscRIpt added inline comments.


================
Comment at: clang/include/clang/Basic/TokenKinds.h:87-93
+/// Return true if this token is a predefined macro
+/// unexpandable by MSVC preprocessor.
+inline bool isUnexpandableMsMacro(TokenKind K) {
+  return K == tok::kw___FUNCTION__ || K == tok::kw___FUNCSIG__ ||
+         K == tok::kw_L__FUNCTION__ || K == tok::kw_L__FUNCSIG__ ||
+         K == tok::kw___FUNCDNAME__;
+}
----------------
tahonermann wrote:
> 
Thanks, I like the name. But shouldn't we reflect that we are referring to only Microsoft (unexpandable) macros? How about `isFunctionLocalPredefinedMsMacro`?


================
Comment at: clang/include/clang/Parse/Parser.h:578-582
+  bool isTokenConcatenable() const {
+    return isTokenStringLiteral() ||
+           getLangOpts().MicrosoftExt &&
+               tok::isMSPredefinedMacro(Tok.getKind());
+  }
----------------
tahonermann wrote:
> RIscRIpt wrote:
> > cor3ntin wrote:
> > > Unless you find a better name,  I think it's preferable to keep `isTokenConcatenable` and `isTokenPredefinedMsMacro` as separate functions.
> > > Also, this seems like a weird place to check for  `getLangOpts().MicrosoftExt`
> > Regarding `getLangOpts().MicrosoftExt`. If you are talking about it's presence in a function which name is meant to be used as a predicate, I agree. If you are talking about `class Parser`, then there're other places with references to `getLangOpts()`.
> > 
> > Without such function `ParseStringLiteralExpression` implementation would be too verbose.
> > Let's decide what we can do after I address other comments.
> > 
> > Meanwhile, I renamed it to `isTokenLikeStringLiteral()`.
> I suggest passing `getLangOpts()` to `isFunctionLocalPredefinedMacro` (`isUnexpandableMsMacro`) and letting it determine whether the token is or is not one of these special not-really-a-string-literal macro things. This will facilitate additional such macros controlled by different options while also colocating the option check with the related tokens.
@tahonermann, in theory it sounds good, but if I understood you correctly, the implementation seem weird to me:
```
inline bool isFunctionLocalPredefinedMsMacro(TokenKind K, const LangOptions& langOpts) {
  if (!langOpts.MicrosoftExt)
    return false;
  return K == tok::kw___FUNCTION__ || K == tok::kw___FUNCSIG__ ||
         K == tok::kw_L__FUNCTION__ || K == tok::kw_L__FUNCSIG__ ||
         K == tok::kw___FUNCDNAME__;
}
```
And I would have to add `#include "LangOptions.h"` in `TokenKinds.h`.


================
Comment at: clang/include/clang/Parse/Parser.h:578-582
+  bool isTokenConcatenable() const {
+    return isTokenStringLiteral() ||
+           getLangOpts().MicrosoftExt &&
+               tok::isMSPredefinedMacro(Tok.getKind());
+  }
----------------
RIscRIpt wrote:
> tahonermann wrote:
> > RIscRIpt wrote:
> > > cor3ntin wrote:
> > > > Unless you find a better name,  I think it's preferable to keep `isTokenConcatenable` and `isTokenPredefinedMsMacro` as separate functions.
> > > > Also, this seems like a weird place to check for  `getLangOpts().MicrosoftExt`
> > > Regarding `getLangOpts().MicrosoftExt`. If you are talking about it's presence in a function which name is meant to be used as a predicate, I agree. If you are talking about `class Parser`, then there're other places with references to `getLangOpts()`.
> > > 
> > > Without such function `ParseStringLiteralExpression` implementation would be too verbose.
> > > Let's decide what we can do after I address other comments.
> > > 
> > > Meanwhile, I renamed it to `isTokenLikeStringLiteral()`.
> > I suggest passing `getLangOpts()` to `isFunctionLocalPredefinedMacro` (`isUnexpandableMsMacro`) and letting it determine whether the token is or is not one of these special not-really-a-string-literal macro things. This will facilitate additional such macros controlled by different options while also colocating the option check with the related tokens.
> @tahonermann, in theory it sounds good, but if I understood you correctly, the implementation seem weird to me:
> ```
> inline bool isFunctionLocalPredefinedMsMacro(TokenKind K, const LangOptions& langOpts) {
>   if (!langOpts.MicrosoftExt)
>     return false;
>   return K == tok::kw___FUNCTION__ || K == tok::kw___FUNCSIG__ ||
>          K == tok::kw_L__FUNCTION__ || K == tok::kw_L__FUNCSIG__ ||
>          K == tok::kw___FUNCDNAME__;
> }
> ```
> And I would have to add `#include "LangOptions.h"` in `TokenKinds.h`.
If you are still concerned about usage of LangOptions in this context, and at the same time you are not against of having such function, I can offer the following alternatives:
1. Make it a local lambda function in `ParseStringLiteralExpression`
2. Make it a TU local function in `ParseExpr.cpp`

I went with (2) to do simplification of condition in switch statement below. Marking thread resolved, as it's no longer relevant.


================
Comment at: clang/include/clang/Sema/Sema.h:5681
 
+  std::vector<Token> ExpandUnexpandableMsMacros(ArrayRef<Token> Toks);
   ExprResult BuildPredefinedExpr(SourceLocation Loc,
----------------
tahonermann wrote:
> 
Renamed to `ExpandFunctionLocalPredefinedMsMacros`. If you think my addition of `Ms` is redundant, let me know.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:1300-1307
+    if (!getLangOpts().MicrosoftExt || !tok::isUnexpandableMsMacro(SavedKind) ||
+        !tok::isUnexpandableMsMacro(NextToken().getKind()) &&
+            !tok::isStringLiteral(NextToken().getKind())) {
+      Res = Actions.ActOnPredefinedExpr(Tok.getLocation(), SavedKind);
+      ConsumeToken();
+      break;
+    }
----------------
tahonermann wrote:
> Since the conditional code is only relevant for some of the tokens here, I would prefer to see the other token kinds (`kw___func__`, `kw___PRETTY_FUNCTION__`) handled separately to avoid the conditional fallthrough. That means duplicating the calls to `Actions.ActOnPredefinedExpr()` and `ConsumeToken()` between the blocks, but that bothers me less than having to understand the complicated condition.
That's a valid point. And by handling two tokens separately allows simplification of the condition. Adjusted code.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:3277-3280
+  // String concatenation.
+  // Note that keywords like __func__ and __FUNCTION__
+  // are not considered to be strings for concatenation purposes,
+  // unless Microsoft extensions are enabled.
----------------
tahonermann wrote:
> I think `__func__` is not considered a string literal in Microsoft mode, so I think this comment needs some adjusting.
Right, removed mention of `__func__`.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:3288-3291
+  assert(
+      (StringToks.size() != 1 ||
+       !tok::isUnexpandableMsMacro(StringToks[0].getKind())) &&
+      "single predefined identifiers shall be handled by ActOnPredefinedExpr");
----------------
tahonermann wrote:
> What is the reason for requiring non-concatenated predefined function local macros to be handled by `ActOnPredefinedExpr()`?
To ensure generation of the same AST as before my changes (with `PredefinedExpr`):

```
|   | `-VarDecl <col:2, col:19> col:13 a 'const char[2]' cinit
|   |   `-PredefinedExpr <col:19> 'const char[2]' __FUNCTION__
|   |     `-StringLiteral <col:19> 'const char[2]' "f"
```

This actually brings a question: do we want to wrap StringLiterals (into PredefinedExpr) which were formed from concatenation of string-literals and some predefined identifier(s)? But it does not really make any sense: concatenated string does not represent any of PredefinedExpr.


================
Comment at: clang/lib/Sema/Sema.cpp:1494
 
+Decl *Sema::getCurScopeDecl() {
+  if (const BlockScopeInfo *BSI = getCurBlock())
----------------
tahonermann wrote:
> `getCurScopeDecl` isn't a great name since this doesn't handle class or namespace scopes. How about `getCurLocalScopeDecl`? That name isn't quite right either since this can return a `TranslationUnitDecl`, but I think it matches the intent fairly well.
> 
> None of the callers need the actual `TranslationUnitDecl` that is returned. I think this function can return `nullptr` is the current scope isn't function local and callers can issue the relevant diagnostic in that case (thus avoiding the need to check that a translation unit decl was returned).
Sounds good to me. The only alternative I can come up with is `getCurFunctionScopeDecl`, but is it any better?

Adjusted code as per your suggestions.


================
Comment at: clang/lib/Sema/SemaExpr.cpp:1971-1974
+  // MSVC treats some of predefined identifiers (e.g. __FUNCTION__) as
+  // expandable macros defined as string literals, which may be concatenated.
+  // Expand them here (in Sema), because StringLiteralParser (in Lex)
+  // doesn't have access to AST.
----------------
tahonermann wrote:
> 
Changed, thank you!


================
Comment at: clang/lib/Sema/SemaExpr.cpp:2001-2004
+    OS << '"'
+       << Lexer::Stringify(PredefinedExpr::ComputeName(
+              getPredefinedExprKind(Tok.getKind()), currentDecl))
+       << '"';
----------------
tahonermann wrote:
> A diagnostic is issued if the call to `getCurScopeDecl()` returned a translation unit declaration (at least in Microsoft mode). Does it make sense to pass a pointer to a `TranslationUnitDecl` here?
Shortly, yes, kind of. `ComputeName` can handle `TranslationUnitDecl` in which case it returns an empty string. This way we implicitly (without additional code) create empty string-literal Tokens which can be handled in `StringLiteralParser`. And we cannot pass non-string-literal tokens into `StringLiteralParser`.


================
Comment at: clang/lib/Sema/SemaExpr.cpp:3694
+  Decl *currentDecl = getCurScopeDecl();
+  if (isa<TranslationUnitDecl>(currentDecl)) {
     Diag(Loc, diag::ext_predef_outside_function);
----------------
tahonermann wrote:
> This can just be a `nullptr` check with my suggested change to `getCurScopeDecl()`.
Done, but I had to undo removal of
```
currentDecl = Context.getTranslationUnitDecl();
```
but it is not a big deal.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153914/new/

https://reviews.llvm.org/D153914



More information about the cfe-commits mailing list