[PATCH] D16529: [clang-tidy] Add modernize-raw-string-literal check

Alexander Kornienko via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 12 07:11:39 PST 2016


alexfh added a comment.

Sorry for the delay. I'm trying to prioritize reviews that are taking less time per-iteration. Unfortunately, here we have a bunch of disagreements and I have to spend significant time to read through your arguments and address your points.


================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:75
@@ +74,3 @@
+
+  return containsEscapes(Text, R"lit('\"?x01)lit");
+}
----------------
Please remove `lit`. I understand your desire to make the string less 'crowded', but it's totally uncommon for this code to use delimiters in raw strings unless absolutely necessary.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:79
@@ +78,3 @@
+bool containsDelimiter(StringRef Bytes, const std::string &Delimiter) {
+  return Bytes.find(")" + Delimiter + R"quote(")quote") != StringRef::npos;
+}
----------------
Please remove `quote`. I understand your desire to make the string less 'crowded', but it's totally uncommon for this code to use delimiters in raw strings unless absolutely necessary.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:92
@@ +91,3 @@
+
+bool isMacroExpansionLocation(const MatchFinder::MatchResult &Result,
+                              SourceLocation Loc) {
----------------
This is not needed, see the comment below at the call to this function.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:97
@@ +96,3 @@
+}
+
+} // namespace
----------------
> I believe we agree on the following: ...

Yes.

> Where we seem to disagree is what algorithm should be used to determine the 
> delimiter when a delimited raw string literal is required.

Pretty much so. IMO, we don't need a universal algorithm that will hardly ever go further than the second iteration, and even in this case would produce a result that's likely undesirable (as I said, `R"lit0(....)lit0"` is not what I would want to see in my code).

The possibility of this code causing performance issues is probably not that high, but I can imagine a code where this could be sub-optimal.

> My implementation is the minimum performance impact for the typical case 
> where the string does not contain )".

One concern about this part is that it could issue 4 concatenation calls fewer in case of an empty delimiter. This may already be handled well by the `llvm::Twine` though.

Any code following potentially-problematic patterns might be fine on its own, but it will attract unnecessary attention when reading code. High frequency of such false alarms has non-trivial cost for code readers and makes it harder to find actual problems.

So please, change the code to avoid these issues. Here's a possible alternative:

  llvm::Optional<std::string> asRawStringLiteral(const StringLiteral *Literal) {
    const StringRef Bytes = Literal->getBytes();
    static const StringRef Delimiters[2][] =
      {{"R\"(", ")\""}, {"R\"lit(", ")lit\""}, {"R\"str(", ")str\""},
      /* add more different ones to make it unlikely to meet all of these in a single literal in the wild */};
    for (const auto &Delim : Delimiters) {
      if (Bytes.find(Delim[1]) != StringRef::npos)
        return (Delim[0] + Bytes + Delim[1]).str();
    }
    // Give up gracefully on literals having all our delimiters.
    return llvm::None;
  }

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:109
@@ +108,3 @@
+
+  if (const auto *Literal = Result.Nodes.getNodeAs<StringLiteral>("lit")) {
+    if (isMacroExpansionLocation(Result, Literal->getLocStart()))
----------------
There's no need to check the result of `getNodeAs` here, since the binding is done in a non-optional branch of the matcher and the type of the matcher corresponds to the template argument type of `getNodeAs`. At most, an `assert` is needed to simplify debugging in case the code is changed.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:110
@@ +109,3 @@
+  if (const auto *Literal = Result.Nodes.getNodeAs<StringLiteral>("lit")) {
+    if (isMacroExpansionLocation(Result, Literal->getLocStart()))
+      return;
----------------
  if (Literal->getLocStart().isMacroID())
    return;

================
Comment at: test/clang-tidy/modernize-raw-string-literal.cpp:2
@@ +1,3 @@
+// RUN: %check_clang_tidy %s modernize-raw-string-literal %t
+
+char const *const BackSlash{"goink\\frob"};
----------------
This can be sorted out later. Please add a FIXME next to the relevant test for now.

================
Comment at: test/clang-tidy/modernize-raw-string-literal.cpp:47
@@ +46,3 @@
+char const *const AlreadyRaw{R"(foobie\\bletch)"};
+char const *const UTF8Literal{u8"foobie\\bletch"};
+char const *const UTF8RawLiteral{u8R"(foobie\\bletch)"};
----------------
Please add a FIXME to handle utf8, utf16, utf32 and wide string literals.


http://reviews.llvm.org/D16529





More information about the cfe-commits mailing list