[llvm] [Support] Add `\{<ref>}` backreferences in Regex::sub() (PR #67220)

Igor Kudrin via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 14:43:54 PDT 2023


igorkudrin wrote:

> I guess a workaround would be to add some empty braces to separate the matching number from the following digits? (`\3()42`)? Maybe that'd be sufficient, if a bit quirky? (I'm not totally opposed to this patch/feature addition either)

The problem is not with regexes but with substitutions, which have their own syntax. Basically, the substitution template is copied into the output string, with sequences like "\\n" and "\\t" replaced by '\n' and '\t', and '\\' followed by digits replaced by the corresponding group from the matched regex. This syntax has no groups of its own, so adding "()" would just copy those characters to the output.
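
To illustrate, here is a rough Python model of the substitution rules described above. It is only a sketch, not the actual `Regex::sub()` code: the `expand_template` name, the error raised for out-of-range references, and the handling of other escaped characters are all assumptions made for the example.

```python
DIGITS = "0123456789"


def expand_template(template: str, groups: list[str]) -> str:
    """Simplified model of the substitution syntax (hypothetical helper).

    groups[0] is the whole match; groups[1:] are the capture groups.
    """
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "\\" or i + 1 == len(template):
            out.append(ch)  # ordinary character, or a lone trailing backslash
            i += 1
            continue
        nxt = template[i + 1]
        if nxt in DIGITS:
            # Every digit after the backslash is consumed as the group number.
            j = i + 1
            while j < len(template) and template[j] in DIGITS:
                j += 1
            ref = int(template[i + 1:j])
            if ref >= len(groups):
                # Assumption: out-of-range references are rejected.
                raise ValueError(f"invalid backreference \\{ref}")
            out.append(groups[ref])
            i = j
        else:
            # "\n" and "\t" become control characters; any other escaped
            # character is copied as-is (an assumption of this sketch).
            out.append({"n": "\n", "t": "\t"}.get(nxt, nxt))
            i += 2
    return "".join(out)


groups = ["abc", "a", "b", "c"]
print(expand_template(r"\3()42", groups))  # c()42
```

In this model, `\3()42` keeps the parentheses in the output, while `\342` is read as a reference to group 342 rather than group 3 followed by "42".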

> How's this compare to other regex libraries & what syntax they use for this situation?

- [Python](https://docs.python.org/3/library/re.html#re.sub): "\\g<num>"
- [.Net](https://learn.microsoft.com/en-us/dotnet/standard/base-types/substitutions-in-regular-expressions#substituting-a-numbered-group): "${num}"
- [ECMAScript](https://tc39.es/ecma262/multipage/text-processing.html#sec-string.prototype.replace) doesn't seem to have a delimited syntax for group numbers (only "$n" and "$nn"), but has "$<name>" to substitute a named group, which can be used instead.
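
As a small illustration of the Python form, the delimited reference lets a digit follow the group directly. The pattern, input, and the `delimited` variable name are made up for the example.

```python
import re

pattern = re.compile(r"(a)(b)(c)")

# "\g<3>" delimits the group number, so the following "9" is literal text.
delimited = pattern.sub(r"\g<3>9", "abc")
print(delimited)  # c9

# Without the delimiter, "\39" is parsed as a reference to group 39,
# which does not exist in this pattern, so re.sub() raises an error.
try:
    pattern.sub(r"\39", "abc")
except re.error as exc:
    print("rejected:", exc)
```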

https://github.com/llvm/llvm-project/pull/67220

