[Lldb-commits] [lldb] r282090 - Fix failing regex tests.

Zachary Turner via lldb-commits lldb-commits at lists.llvm.org
Wed Sep 21 17:27:12 PDT 2016


On Wed, Sep 21, 2016 at 5:13 PM Greg Clayton <gclayton at apple.com> wrote:

>
> > On Sep 21, 2016, at 4:43 PM, Zachary Turner <zturner at google.com> wrote:
> >
> > You need to duplicate something on the heap once when you execute the
> regex.  And in turn you save tens or hundreds or copies on the way there
> because of inefficient string usage.
>
> Where? please show this.
>
> I see the following callers:
>
>           const char *class_name =
>               iterator->second->GetClassName().AsCString("<unknown>");
>           if (regex_up && class_name &&
>               !regex_up->Execute(llvm::StringRef(class_name)))
>
> You are adding a strlen() call here to construct the StringRef, not saving
> anything.
>
Right, this is only this way because i deleted the const char * version of
the function.  Which as I explained in a previous email, I did because it
forces the compiler to tell you when you're using a raw string literal, so
you can manually examine it and make sure it's not null, since that would
cause it to crash.  So you're right, here it doesn't save you anything, but
it doesn't hurt anything either, and the explicit conversion helps catch
bugs in OTHER  places.  Which again, is only temporary until all the
conversions are done, and we can remove the =delete overloads.


>
>
> bool CommandObjectRegexCommand::DoExecute(const char *command,
>                                           CommandReturnObject &result) {
>   if (command) {
>     EntryCollection::const_iterator pos, end = m_entries.end();
>     for (pos = m_entries.begin(); pos != end; ++pos) {
>       RegularExpression::Match regex_match(m_max_matches);
>
>       if (pos->regex.Execute(command, &regex_match)) {
>         std::string new_command(pos->command);
>         std::string match_str;
>
> No copy saved. Just wasted time with strlen in StringRef constructor.
>
Right.  But that's because DoExecute takes a const char*, not a StringRef.
And again, doing the entire codebase all at once is infeasible.  Think
about how many places along the pipeline are using this command.  This
isn't the only place where we use it.  We use it later in the function, we
use it in the function above us on the callchain.  We use it lower down on
the callchain.  And every time we do anything with it, we calculate the
length.  So yes, there's an extra one here.  But again, it's only temporary
until in a subsequent patch I change DoExecute() to take a StringRef.
Can't do it all at once.


>
>
>     DataVisualization::Categories::ForEach(
>         [&regex, &result](const lldb::TypeCategoryImplSP &category_sp) ->
> bool {
>           if (regex) {
>             bool escape = true;
>             if (regex->GetText() == category_sp->GetName()) {
>               escape = false;
>             } else if (regex->Execute(llvm::StringRef::withNullAsEmpty(
>                            category_sp->GetName()))) {
>               escape = false;
>             }
>
>             if (escape)
>               return true;
>           }
>
> No copy saved. Just wasted time with strlen in StringRef constructor.
>
>
>       TypeCategoryImpl::ForEachCallbacks<FormatterType> foreach;
>       foreach
>         .SetExact([&result, &formatter_regex, &any_printed](
>                       ConstString name,
>                       const FormatterSharedPointer &format_sp) -> bool {
>           if (formatter_regex) {
>             bool escape = true;
>             if (name.GetStringRef() == formatter_regex->GetText()) {
>               escape = false;
>             } else if (formatter_regex->Execute(name.GetStringRef())) {
>               escape = false;
>             }
>
> No copy saved. Just wasted time with strlen in StringRef constructor.
>
>     bool ParseCoordinate(const char *id_cstr) {
>       RegularExpression regex;
>       RegularExpression::Match regex_match(3);
>
>       llvm::StringRef id_ref = llvm::StringRef::withNullAsEmpty(id_cstr);
>       bool matched = false;
>       if (regex.Compile(llvm::StringRef("^([0-9]+),([0-9]+),([0-9]+)$")) &&
>           regex.Execute(id_ref, &regex_match))
>         matched = true;
>       else if (regex.Compile(llvm::StringRef("^([0-9]+),([0-9]+)$")) &&
>                regex.Execute(id_ref, &regex_match))
>         matched = true;
>       else if (regex.Compile(llvm::StringRef("^([0-9]+)$")) &&
>                regex.Execute(id_ref, &regex_match))
>
> No copy saved. Just wasted time with strlen in StringRef constructor.
>
Again, the solution is to have ParseCoordinate take a StringRef.  Problem
solved.


>
> void DWARFCompileUnit::ParseProducerInfo() {
>   m_producer_version_major = UINT32_MAX;
>   m_producer_version_minor = UINT32_MAX;
>   m_producer_version_update = UINT32_MAX;
>
>   const DWARFDebugInfoEntry *die = GetCompileUnitDIEPtrOnly();
>   if (die) {
>
>     const char *producer_cstr = die->GetAttributeValueAsString(
>         m_dwarf2Data, this, DW_AT_producer, NULL);
>     if (producer_cstr) {
>       RegularExpression llvm_gcc_regex(
>           llvm::StringRef("^4\\.[012]\\.[01] \\(Based on Apple "
>                           "Inc\\. build [0-9]+\\) \\(LLVM build "
>                           "[\\.0-9]+\\)$"));
>       if (llvm_gcc_regex.Execute(llvm::StringRef(producer_cstr))) {
>         m_producer = eProducerLLVMGCC;
>       } else if (strstr(producer_cstr, "clang")) {
>         static RegularExpression g_clang_version_regex(
>             llvm::StringRef("clang-([0-9]+)\\.([0-9]+)\\.([0-9]+)"));
>         RegularExpression::Match regex_match(3);
>         if (g_clang_version_regex.Execute(llvm::StringRef(producer_cstr),
>                                           &regex_match)) {
>
> No copy saved. Just wasted time with strlen in StringRef constructor (2 of
> them mind you).
>
In this case the solution is to have GetAttributeValueAsString *return* a
StringRef.  Problem solved again.  Note that there are already 2 strlens in
the original code.  One internally in each execution of the regular
expression.  Now there are 4.  Once GetAttributeValueAsString() returns a
StringRef, there will be 0.


And similarly, I could keep going, but I think you get the idea.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20160922/8f384447/attachment-0001.html>


More information about the lldb-commits mailing list