[PATCH] D16529: [clang-tidy] Add modernize-raw-string-literal check

Wed Jan 27 12:11:25 PST 2016

LegalizeAdulthood added inline comments.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:27
@@ +26,3 @@
+  // we only transform ASCII string literals
+  if (!Literal->isAscii())
+    return false;
----------------
alexfh wrote:
> Why can't the check convert non-ascii string literals to corresponding raw string literals?
Just doing one thing at a time and not trying to create a "kitchen sink" check on the first pass.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:57
@@ +56,3 @@
+
+  // already a raw string literal if R comes before "
+  if (Text.find_first_of("R") < Text.find_first_of(R"(")"))
----------------
alexfh wrote:
> nit: Capitalization, punctuation. Same for other comments in the file.
I don't understand what exactly you are asking me to change.  There is something wrong with the comment?

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:77
@@ +76,3 @@
+  std::string Delimiter;
+  for (int Counter = 0; containsDelimiter(Bytes, Delimiter); ++Counter) {
+    if (Counter == 0) {
----------------
alexfh wrote:
> Why don't you try an empty delimiter? It will cover a majority of cases.
> 
> Also, even though the function is used only when generating fix-it hints, it still should be more effective than it is now. Most of the heap allocations and copies caused by the usage of std::ostringstream, std::string and std::string::operator+ can be avoided, e.g. like this:
> 
>   static const StringRef Delimiters[2][] =
>     {{"", "R\"(", ")\""}, {"lit", "R\"lit(", ")lit\""}, {"str", "R\"str(", ")str\""},
>     /* add more different ones to make it unlikely to meet all of these in a single literal in the wild */};
>   for (const auto &Delim : Delimiters) {
>     if (Bytes.find(Delim[2]) != StringRef::npos)
>       return (Delim[1] + Bytes + Delim[2]).str();
>   }
>   // Give up gracefully on literals having all our delimiters.
The very first thing tried is an empty delimiter.

I coded a solution to the problem that always works.

I find it less understandable to try a bunch of random delimiters and then fail if they are all present in the string.  Then the whole algorithm becomes more complicated because even though I **could** construct a fixit, I'm failing stupidly because all my "favorite" delimiters were used in your string.

Sometimes I feel these reviews over-obsess with allocations and efficiency instead of implementing a simple general algorithm that works reliably.

We don't even have any measurements to show that this performance consideration is dominant or even noticeable, but I'm being asked to take a perfectly correct algorithm and cram it into a StringRef corset.

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:103
@@ +102,3 @@
+  // raw string literals require C++11 or later
+  if (!Options.CPlusPlus11 && !Options.CPlusPlus14 && !Options.CPlusPlus1z)
+    return;
----------------
aaron.ballman wrote:
> Can just use !Options.CPlusPlus11 -- 11 is implied in 14 and 1z.
Then the documentation needs to be updated.  (Actually I couldn't find any documentation on these flags and how they relate to each other because they use preprocessor hell to generate bitfields in a structure.)

================
Comment at: clang-tidy/modernize/RawStringLiteralCheck.cpp:117
@@ +116,3 @@
+        Result.Context->getLangOpts());
+    StringRef Replacement = asRawStringLiteral(Literal);
+    diag(Literal->getLocStart(),
----------------
alexfh wrote:
> `Replacement` will point to deleted memory right after this line.
For the record: I **hate** StringRef.  I spend about 80% of my time on clang-tidy submissions dicking around with StringRef based on all the review comments.  It is a very hard class to use properly.  If a usage like this results in referencing deleted memory, it shouldn't even compile.

http://reviews.llvm.org/D16529