[PATCH] D105759: Implement P2361 Unevaluated string literals

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 26 09:54:27 PDT 2023


aaron.ballman added a subscriber: ChuanqiXu.
aaron.ballman added a comment.

This also should update the cxx_status page and have a release note.



================
Comment at: clang/include/clang/Basic/Attr.td:1411
   let Documentation = [DeprecatedDocs];
+  let ParseArgumentsAsUnevaluated = 1;
 }
----------------
What is the plan for non-standard attributes? Are you planning to handle those in a follow-up, or should we be investigating those right now?


================
Comment at: clang/include/clang/Parse/Parser.h:1857-1859
                            bool FailImmediatelyOnInvalidExpr = false,
-                           bool EarlyTypoCorrection = false);
+                           bool EarlyTypoCorrection = false,
+                           bool AllowEvaluatedString = true);
----------------
Two default `bool` params is a bad thing but three default `bool` params seems like we should fix the interface at this point. WDYT?

Also, it's not clear what the new parameter will do, the function could use comments unless fixing the interface makes it sufficiently clear.


================
Comment at: clang/lib/AST/Expr.cpp:1140
+  case Unevaluated:
+    return sizeof(char); // Host;
+    break;
----------------
shafik wrote:
> Why not grouped w/ `Ordinary` above?
Specifically because we want the host encoding, not the target encoding.


================
Comment at: clang/lib/AST/Expr.cpp:1165
+    unsigned CharByteWidth = mapCharByteWidth(Ctx.getTargetInfo(), Kind);
+    unsigned ByteLength = Str.size();
+    assert((ByteLength % CharByteWidth == 0) &&
----------------
shafik wrote:
> Isn't this the same as `Length`?
It is -- I think we can get rid of `ByteLength`, but it's possible that this exists because of the optimization comment below. I don't insist, but it would be nice to know if we can replace the switch with `Length /= CharByteWidth` these days.


================
Comment at: clang/lib/AST/Expr.cpp:1190
+    StringLiteralBits.CharByteWidth = 1;
+    StringLiteralBits.IsPascal = false;
+  }
----------------
Add `assert(!Pascal && "Can't make an unevaluated Pascal string");` ?


================
Comment at: clang/lib/AST/Expr.cpp:1201
   // Initialize the trailing array of char holding the string data.
-  std::memcpy(getTrailingObjects<char>(), Str.data(), ByteLength);
+  std::memcpy(getTrailingObjects<char>(), Str.data(), Str.size());
 
----------------
shafik wrote:
> Isn't `Str.size()` the same as `ByteLength`? 
I think it's more clear to use `Str.size()` because we're copying from `Str.data()`.


================
Comment at: clang/lib/AST/Expr.cpp:1238
   switch (getKind()) {
+  case Unevaluated: // fallthrough. no prefix.
   case Ordinary:
----------------



================
Comment at: clang/lib/Lex/LiteralSupport.cpp:90
 
+static bool EscapeValidInUnevaluatedStringLiteral(char Escape) {
+  switch (Escape) {
----------------
shafik wrote:
> Should we use `Is` as a prefix here? Right now it should like we are modifying something.
+1, I think `Is` would be an improvement.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:98
+  case 't':
+  case 'r':
+    return true;
----------------
We're still missing support for some escape characters from: http://eel.is/c++draft/lex#nt:simple-escape-sequence-char

Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:1912
+      hadError = true;
+      return;
+    }
----------------
Doesn't returning here leave the object in a partially-initialized state? That seems bad.


================
Comment at: clang/lib/Lex/LiteralSupport.cpp:2080-2082
+      if (!isUnevaluated() && Features.PascalStrings &&
+          ThisTokBuf + 1 != ThisTokEnd && ThisTokBuf[0] == '\\' &&
+          ThisTokBuf[1] == 'p') {
----------------
Is there test coverage that we diagnose this properly?


================
Comment at: clang/lib/Lex/PPMacroExpansion.cpp:1872-1873
     else if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {
-      StringLiteralParser Literal(Tok, *this);
+      StringLiteralParser Literal(Tok, *this,
+                                  StringLiteralEvalMethod::Unevaluated);
       if (Literal.hadError)
----------------
Test coverage for this change?


================
Comment at: clang/lib/Lex/Pragma.cpp:807
   if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {
     StringLiteralParser Literal(Tok, PP);
     if (Literal.hadError)
----------------
aaron.ballman wrote:
> cor3ntin wrote:
> > aaron.ballman wrote:
> > > Should this also be modified?
> > Probably but because I'm not super familiar with module map things I preferred being conservative
> Paging @rsmith for opinions.
> 
> Lacking those opinions, I think being conservative here is fine.
Pinging @ChuanqiXu for opinions.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:3501-3503
+    } else if (!AllowEvaluatedString && tok::isStringLiteral(Tok.getKind())) {
+      Expr = ParseUnevaluatedStringLiteralExpression();
+    } else {
----------------
I'm surprised we need special logic in `ParseExpressionList()` for handling unevaluated string literals; I would have expected that to be needed when parsing a string literal. Nothing changed in the grammar for http://eel.is/c++draft/expr.post.general#nt:expression-list (or initializer-list), so these changes seem wrong. Can you explain the changes a bit more?


================
Comment at: clang/lib/Sema/SemaDeclAttr.cpp:878-879
 
-  if (!isIntOrBool(AL.getArgAsExpr(0))) {
+  Expr *First = AL.getArgAsExpr(0);
+  if (isa<StringLiteral>(First) || !isIntOrBool(First)) {
     S.Diag(AL.getLoc(), diag::err_attribute_argument_n_type)
----------------
Test coverage for these changes?


================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16470
   StringLiteral *Lit = cast<StringLiteral>(LangStr);
-  if (!Lit->isOrdinary()) {
-    Diag(LangStr->getExprLoc(), diag::err_language_linkage_spec_not_ascii)
-      << LangStr->getSourceRange();
-    return nullptr;
-  }
+  assert(Lit->isUnevaluated() && "Unexpected string literal kind");
 
----------------
Test coverage for changes?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105759/new/

https://reviews.llvm.org/D105759



More information about the cfe-commits mailing list