[clang] fix(clang/**.py): fix invalid escape sequences (PR #94029)
Donát Nagy via cfe-commits
cfe-commits at lists.llvm.org
Wed Mar 12 02:49:48 PDT 2025
NagyDonat wrote:
> to me it seemed like the `r"..."` strings are supposed to be used for regular expressions, and in this change you appear to transform those strings into plain old strings. Could you help me understand this?
In Python the `r` string prefix stands for a _raw_ string literal where the escape sequences are not interpreted (see [relevant part of the language reference](https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences)).
The presence or absence of the `r` prefix does not influence the _type_ of the object represented by the string literal -- it only influences the _contents_ of the string object. For example, the raw string literal `r"\n+"` (three characters: backslash, letter "n", plus) is exactly equivalent to the plain old string literal `"\\n+"` (where the two backslashes are interpreted as an escape sequence that produces a single backslash). (Note that without the `r` the literal `"\n+"` consists of two characters: a newline and a plus sign.)
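A quick way to check this in an interpreter is sketched below. The variable names `raw`, `escaped` and `plain` are just illustrative and do not come from the PR:

```python
# Illustrative names only; none of these come from the PR itself.
raw = r"\n+"       # raw literal: backslash, letter "n", plus
escaped = "\\n+"   # plain literal: "\\" is an escape sequence for one backslash
plain = "\n+"      # plain literal: "\n" is an escape sequence for a newline

# Both spellings produce the same str object contents.
assert raw == escaped
assert type(raw) is str and type(plain) is str

# The raw literal has three characters, the unprefixed one only two.
assert len(raw) == 3
assert len(plain) == 2
assert plain[0] == chr(10)  # code point 10 is the newline character
```

All assertions should pass silently, which is consistent with the prefix changing only the contents of the string and not its type.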
Raw strings are indeed frequently used for regular expressions, because a string that represents a regexp usually contains many backslashes and it's more comfortable to specify them as a raw string literal -- but there is no formal connection between them. (Unlike in languages such as Perl or shell scripts, regular expressions in Python are implemented purely within the standard library; there is no special syntax for them.)
https://github.com/llvm/llvm-project/pull/94029