[Lldb-commits] [PATCH] D66174: [Utility] Reimplement RegularExpression on top of llvm::Regex

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Thu Aug 15 02:40:26 PDT 2019


labath added a comment.

I was unhappy with the `()` workaround, so I did some investigation. Now, I am even more unhappy, but at least better educated. :P

Basically, what I've learned is that, according to POSIX (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html), an empty string is not a valid regular expression. Most tools seem to ignore that and treat it as a pattern that matches everything (and I can't say I blame them). However, the BSD implementation (which the llvm code is based on) tries to implement the standard strictly, and it rejects the empty string as an invalid expression.

Funnily enough, `()` is not valid according to POSIX either, but the BSD implementation accepts it. So do grep (on Linux) and python. Perl, however, behaves differently: it still considers it a valid expression, but one that doesn't match anything.

`a||b` (a OR empty pattern OR b) is also not valid. grep, python and perl accept that, but the BSD regexes don't.
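
For anyone who wants to check how their own platform treats these edge cases, here is a minimal sketch. It is not part of the patch, and it probes the system's POSIX `regcomp` from `<regex.h>` (assumed to be available), not `llvm::Regex`. The helper name `CompilesAsPosixERE` is made up for illustration.

```cpp
#include <cassert>
#include <regex.h> // POSIX regcomp/regfree; assumed available on this platform

// Hypothetical helper: returns true if the platform's regcomp() accepts
// `pattern` as a POSIX extended regular expression.
bool CompilesAsPosixERE(const char *pattern) {
  assert(pattern != nullptr && "pattern must be a valid C string");
  regex_t re;
  // REG_NOSUB: we only care whether the pattern compiles, not about matches.
  const int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
  if (rc != 0)
    return false; // Compilation failed, so there is nothing to free.
  regfree(&re);
  return true;
}
```

Calling this with `""`, `"()"` and `"a||b"` should show which of the three the local libc accepts. The results are expected to differ between implementations, which is exactly the inconsistency described above.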

So, overall, it seems to me that there is a lot of confusion about what empty (sub-)patterns should do. I don't think special-casing "" in lldb_private::RegularExpression would help alleviate any of that confusion, as we would still be left with all of the other inconsistencies. So, if we want to go through with this (and I still think we should), I guess we'll just have to bite the bullet and say that our expressions are now (more or less) POSIX-conformant, and repeat that to anyone who comes complaining that lldb regexes behave differently than grep, python or older versions of lldb...



================
Comment at: lldb/source/Commands/CommandObjectFrame.cpp:577-578
+              if (llvm::Error err = regex.GetError())
+                result.GetErrorStream().Printf(
+                    "error: %s\n", llvm::toString(std::move(err)).c_str());
               else
----------------
`GetErrorStream().Format("error: {0}\n", llvm::fmt_consume(std::move(err)));` would be slightly less verbose.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66174/new/

https://reviews.llvm.org/D66174




