[Lldb-commits] [lldb] r282090 - Fix failing regex tests.

Zachary Turner via lldb-commits lldb-commits at lists.llvm.org
Wed Sep 21 17:11:45 PDT 2016


Incidentally, it appears the version of regcomp() on apple supports REG_PEND
<https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/regcomp.3.html>
, although that is not a standard.  And apple is already the only platform
where enhanced regexes are a thing.  So if you want to keep enhanced
regexes, one thing we could do is:

a) Have all non Apple platforms delegate to the LLVM implementation.
b) Have Apple's implementation use the system library but set the end
pointer.

That said, if you don't care about enhanced mode, let's just have everyone
use LLVM's.

On Wed, Sep 21, 2016 at 5:03 PM Zachary Turner <zturner at google.com> wrote:

> It supports extended, just not *enhanced*.  Enhanced appears to be an
> apple specific thing, and even the developer documentation recommends not
> using it in portable code.
>
> On Wed, Sep 21, 2016 at 5:00 PM Jim Ingham <jingham at apple.com> wrote:
>
> IIRC you said the llvm one doesn't support extended regular expressions.
> If that's true, please don't do this till that is fixed.  I think pretty
> much everybody who is using reg expressions expects to be able to use
> extended regular expressions.
>
> Jim
>
> > On Sep 21, 2016, at 4:58 PM, Zachary Turner via lldb-commits <
> lldb-commits at lists.llvm.org> wrote:
> >
> > Incidentally even the STL regex implementation supports specifying the
> end pointer.  So I would call the system one deficient in this regard and
> advocate for replacing it sooner rather than later.
> >
> > On Wed, Sep 21, 2016 at 4:52 PM Zachary Turner <zturner at google.com>
> wrote:
> > Looks like llvm's regex is better than LLDB's in this regard, since it
> supports explicitly setting the end pointer.  I can see a couple options:
> >
> > 1) Check if it's null terminated by peeking one past the end, and
> copying if it's not.  This is pretty hackish, not crazy about this idea.
> > 2) Un-delete the const char * version of the function but leave the
> StringRef overload, find all places where I added the explicit conversion
> and remove them so they invoke the const char* overload.
> > 3) Change lldb::RegularExpression to just delegate to llvm under the
> hood and set the end pointer.
> >
> > On Wed, Sep 21, 2016 at 4:44 PM Zachary Turner <zturner at google.com>
> wrote:
> > Actually wait, it doesn't.  It just explicitly sets the end pointer.
> >
> > On Wed, Sep 21, 2016 at 4:44 PM Zachary Turner <zturner at google.com>
> wrote:
> > Worth noting that llvm::Regex has this constructor:
> >
> >
> > Regex::Regex(StringRef regex, unsigned Flags) {
> >   unsigned flags = 0;
> >   preg = new llvm_regex();
> >   preg->re_endp = regex.end();
> >   if (Flags & IgnoreCase)
> >     flags |= REG_ICASE;
> >   if (Flags & Newline)
> >     flags |= REG_NEWLINE;
> >   if (!(Flags & BasicRegex))
> >     flags |= REG_EXTENDED;
> >   error = llvm_regcomp(preg, regex.data(), flags|REG_PEND);
> > }
> >
> > So it assumes null termination even though you have a StringRef.
> >
> > On Wed, Sep 21, 2016 at 4:43 PM Zachary Turner <zturner at google.com>
> wrote:
> > You need to duplicate something on the heap once when you execute the
> regex.  And in turn you save tens or hundreds or copies on the way there
> because of inefficient string usage.
> >
> > We could also just un-delete the overload that takes a const char*, then
> the duplication would only ever happen when you explicitly use a StringRef.
> >
> > I don't agree this should be reverted.  In the process of doing this
> conversion I eliminated numerous careless string copies.
> >
> > On Wed, Sep 21, 2016 at 4:38 PM Greg Clayton <gclayton at apple.com> wrote:
> > This it the perfect example of why not to use a StringRef since the
> string needs to be null terminated. Why did we change this? Now even if you
> call this function:
> >
> > RegularExpression r(...);
> >
> > r.Execute(".......................", ...)
> >
> > You will need to duplicate the string on the heap just to execute this.
> Please revert this. Anything that requires null terminate is not a
> candidate for converting to StringRef.
> >
> >
> > > On Sep 21, 2016, at 10:13 AM, Zachary Turner via lldb-commits <
> lldb-commits at lists.llvm.org> wrote:
> > >
> > > Author: zturner
> > > Date: Wed Sep 21 12:13:51 2016
> > > New Revision: 282090
> > >
> > > URL: http://llvm.org/viewvc/llvm-project?rev=282090&view=rev
> > > Log:
> > > Fix failing regex tests.
> > >
> > > r282079 converted the regular expression interface to accept
> > > and return StringRefs instead of char pointers.  In one case
> > > a null pointer check was converted to an empty string check,
> > > but this was an incorrect conversion because an empty string
> > > is a valid regular expression.  Removing this check should
> > > fix the test failures.
> > >
> > > Modified:
> > >    lldb/trunk/source/Core/RegularExpression.cpp
> > >
> > > Modified: lldb/trunk/source/Core/RegularExpression.cpp
> > > URL:
> http://llvm.org/viewvc/llvm-project/lldb/trunk/source/Core/RegularExpression.cpp?rev=282090&r1=282089&r2=282090&view=diff
> > >
> ==============================================================================
> > > --- lldb/trunk/source/Core/RegularExpression.cpp (original)
> > > +++ lldb/trunk/source/Core/RegularExpression.cpp Wed Sep 21 12:13:51
> 2016
> > > @@ -102,7 +102,7 @@ bool RegularExpression::Compile(llvm::St
> > > //---------------------------------------------------------------------
> > > bool RegularExpression::Execute(llvm::StringRef str, Match *match)
> const {
> > >   int err = 1;
> > > -  if (!str.empty() && m_comp_err == 0) {
> > > +  if (m_comp_err == 0) {
> > >     // Argument to regexec must be null-terminated.
> > >     std::string reg_str = str;
> > >     if (match) {
> > >
> > >
> > > _______________________________________________
> > > lldb-commits mailing list
> > > lldb-commits at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
> >
> > _______________________________________________
> > lldb-commits mailing list
> > lldb-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20160922/f67cc0c6/attachment-0001.html>


More information about the lldb-commits mailing list