[Lldb-commits] [PATCH] D66174: [Utility] Reimplement RegularExpression on top of llvm::Regex
Jonas Devlieghere via Phabricator via lldb-commits
lldb-commits at lists.llvm.org
Thu Aug 15 08:01:13 PDT 2019
JDevlieghere added a comment.
In D66174#1631055 <https://reviews.llvm.org/D66174#1631055>, @labath wrote:
> I was unhappy with the `()` workaround, so I did some investigation. Now, I am even more unhappy, but at least better educated. :P
> Basically, what I've learned is that, according to POSIX (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html), an empty string is not a valid regular expression. Most tools seem to ignore that and treat it as if it matches everything (and I can't say I blame them). However, the BSD implementation (which the llvm stuff is based on) tried to be a strict implementation, and it rejects the empty string as an invalid expression.
> Funnily enough, `()` is not valid according to POSIX either, but the BSD implementation accepts it. So do (Linux) grep and Python. However, Perl rejects it (it still considers it a valid expression, but one that doesn't match anything).
> `a||b` (a OR empty pattern OR b) is also not valid. grep, Python and Perl accept that, but the BSD regexes don't.
> So, overall, it seems to me that there is a lot of confusion about what empty (sub-)patterns should do. I don't think special-casing "" in lldb_private::RegularExpression would help alleviate any of that confusion, as we would still be left with all of the other inconsistencies. So, if we want to go through with this (and I still think we should), I guess we'll just have to bite the bullet and say that our expressions are now (more or less) POSIX conformant, and repeat that to anyone who comes complaining that lldb regexes behave differently than grep, python or older versions of lldb...
For the record, in case my earlier response gets lost in the inline comments, I strongly support this :-)
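
As a rough illustration of the inconsistency described in the quoted comment, here is a minimal Python sketch that checks whether Python's `re` module accepts the three edge cases mentioned there (`""`, `()` and `a||b`). It only shows the behaviour of Python's engine; it says nothing about llvm::Regex or the BSD implementation, which the comment reports as rejecting `""` and `a||b`. The helper name `python_accepts` is made up for this example.

```python
import re


def python_accepts(pattern):
    """Return True if Python's re module compiles the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The three edge cases discussed in the quoted comment.
edge_cases = ["", "()", "a||b"]
results = {pattern: python_accepts(pattern) for pattern in edge_cases}

for pattern, ok in results.items():
    print(f"{pattern!r}: {'accepted' if ok else 'rejected'}")

# With Python's engine, the empty alternative in a||b matches the empty string.
empty_alternative_matches = re.fullmatch("a||b", "") is not None
```

Python compiles all three patterns, which matches what the comment says about Python; a strictly POSIX-conformant engine would be expected to reject them.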