[PATCH] D105759: Implement P2361 Unevaluated string literals

Mon Jun 26 10:30:44 PDT 2023

cor3ntin marked 6 inline comments as done.
cor3ntin added inline comments.

================
Comment at: clang/include/clang/Basic/Attr.td:1411
   let Documentation = [DeprecatedDocs];
+  let ParseArgumentsAsUnevaluated = 1;
 }
----------------
aaron.ballman wrote:
> What is the plan for non-standard attributes? Are you planning to handle those in a follow-up, or should we be investigating those right now?
I don't feel I'm qualified to answer that. Ideally, attributes that expect string literals that are not evaluated should follow suite.

================
Comment at: clang/include/clang/Parse/Parser.h:1857-1859
                            bool FailImmediatelyOnInvalidExpr = false,
-                           bool EarlyTypoCorrection = false);
+                           bool EarlyTypoCorrection = false,
+                           bool AllowEvaluatedString = true);
----------------
aaron.ballman wrote:
> Two default `bool` params is a bad thing but three default `bool` params seems like we should fix the interface at this point. WDYT?
> 
> Also, it's not clear what the new parameter will do, the function could use comments unless fixing the interface makes it sufficiently clear.
I'm still not sure that's the best solution. `AllowEvaluatedString` would only ever be false for attributes, I consider duplicating the function, except it does quite a bit for variadics, which apparently attribute support

Maybe would could have

```
ParseAttributeArgumentList
ParseExpressionList
ParseExpressionListImpl?
```
?

================
Comment at: clang/lib/AST/Expr.cpp:1165
+    unsigned CharByteWidth = mapCharByteWidth(Ctx.getTargetInfo(), Kind);
+    unsigned ByteLength = Str.size();
+    assert((ByteLength % CharByteWidth == 0) &&
----------------
cor3ntin wrote:
> aaron.ballman wrote:
> > shafik wrote:
> > > Isn't this the same as `Length`?
> > It is -- I think we can get rid of `ByteLength`, but it's possible that this exists because of the optimization comment below. I don't insist, but it would be nice to know if we can replace the switch with `Length /= CharByteWidth` these days.
> Only when CharByteWidth == 1
I think we should.

================
Comment at: clang/lib/Lex/LiteralSupport.cpp:98
+  case 't':
+  case 'r':
+    return true;
----------------
aaron.ballman wrote:
> We're still missing support for some escape characters from: http://eel.is/c++draft/lex#nt:simple-escape-sequence-char
> 
> Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?
> Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?

They are dealt with elsewhere yes (and supported)

================
Comment at: clang/lib/Lex/LiteralSupport.cpp:2080-2082
+      if (!isUnevaluated() && Features.PascalStrings &&
+          ThisTokBuf + 1 != ThisTokEnd && ThisTokBuf[0] == '\\' &&
+          ThisTokBuf[1] == 'p') {
----------------
aaron.ballman wrote:
> Is there test coverage that we diagnose this properly?
What sort of test would you like to see?

================
Comment at: clang/lib/Parse/ParseExpr.cpp:3501-3503
+    } else if (!AllowEvaluatedString && tok::isStringLiteral(Tok.getKind())) {
+      Expr = ParseUnevaluatedStringLiteralExpression();
+    } else {
----------------
aaron.ballman wrote:
> I'm surprised we need special logic in `ParseExpressionList()` for handling unevaluated string literals; I would have expected that to be needed when parsing a string literal. Nothing changed in the grammar for http://eel.is/c++draft/expr.post.general#nt:expression-list (or initializer-list), so these changes seem wrong. Can you explain the changes a bit more?
We use `ParseExpressionList` when parsing attribute arguments, and some attributes have unevaluate string as argument - I agree with you that I'd rather find a better solution for attributes, but I came up empty. There is no further reason for this change, and you are right it does not match the grammar.

================
Comment at: clang/lib/Sema/SemaDeclAttr.cpp:878-879

-  if (!isIntOrBool(AL.getArgAsExpr(0))) {
+  Expr *First = AL.getArgAsExpr(0);
+  if (isa<StringLiteral>(First) || !isIntOrBool(First)) {
     S.Diag(AL.getLoc(), diag::err_attribute_argument_n_type)
----------------
aaron.ballman wrote:
> Test coverage for these changes?
There is one somewhere, I don;t remember where, The reason we need to do that is that Unevaluated StringLiterals don''t have types

================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16470
   StringLiteral *Lit = cast<StringLiteral>(LangStr);
-  if (!Lit->isOrdinary()) {
-    Diag(LangStr->getExprLoc(), diag::err_language_linkage_spec_not_ascii)
-      << LangStr->getSourceRange();
-    return nullptr;
-  }
+  assert(Lit->isUnevaluated() && "Unexpected string literal kind");

----------------
aaron.ballman wrote:
> Test coverage for changes?
There are some in dcl.link/p2.cpp

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105759/new/

https://reviews.llvm.org/D105759