<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Sep 21, 2016 at 5:13 PM Greg Clayton <<a href="mailto:gclayton@apple.com">gclayton@apple.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="gmail_msg">
> On Sep 21, 2016, at 4:43 PM, Zachary Turner <<a href="mailto:zturner@google.com" class="gmail_msg" target="_blank">zturner@google.com</a>> wrote:<br class="gmail_msg">
><br class="gmail_msg">
> You need to duplicate something on the heap once when you execute the regex. And in turn you save tens or hundreds or copies on the way there because of inefficient string usage.<br class="gmail_msg">
<br class="gmail_msg">
Where? please show this.<br class="gmail_msg">
<br class="gmail_msg">
I see the following callers:<br class="gmail_msg">
<br class="gmail_msg">
const char *class_name =<br class="gmail_msg">
iterator->second->GetClassName().AsCString("<unknown>");<br class="gmail_msg">
if (regex_up && class_name &&<br class="gmail_msg">
!regex_up->Execute(llvm::StringRef(class_name)))<br class="gmail_msg">
<br class="gmail_msg">
You are adding a strlen() call here to construct the StringRef, not saving anything.<br class="gmail_msg"></blockquote><div>Right, this is only this way because i deleted the const char * version of the function. Which as I explained in a previous email, I did because it forces the compiler to tell you when you're using a raw string literal, so you can manually examine it and make sure it's not null, since that would cause it to crash. So you're right, here it doesn't save you anything, but it doesn't hurt anything either, and the explicit conversion helps catch bugs in OTHER places. Which again, is only temporary until all the conversions are done, and we can remove the =delete overloads.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br class="gmail_msg">
<br class="gmail_msg">
bool CommandObjectRegexCommand::DoExecute(const char *command,<br class="gmail_msg">
CommandReturnObject &result) {<br class="gmail_msg">
if (command) {<br class="gmail_msg">
EntryCollection::const_iterator pos, end = m_entries.end();<br class="gmail_msg">
for (pos = m_entries.begin(); pos != end; ++pos) {<br class="gmail_msg">
RegularExpression::Match regex_match(m_max_matches);<br class="gmail_msg">
<br class="gmail_msg">
if (pos->regex.Execute(command, ®ex_match)) {<br class="gmail_msg">
std::string new_command(pos->command);<br class="gmail_msg">
std::string match_str;<br class="gmail_msg">
<br class="gmail_msg">
No copy saved. Just wasted time with strlen in StringRef constructor.<br class="gmail_msg"></blockquote><div>Right. But that's because DoExecute takes a const char*, not a StringRef. And again, doing the entire codebase all at once is infeasible. Think about how many places along the pipeline are using this command. This isn't the only place where we use it. We use it later in the function, we use it in the function above us on the callchain. We use it lower down on the callchain. And every time we do anything with it, we calculate the length. So yes, there's an extra one here. But again, it's only temporary until in a subsequent patch I change DoExecute() to take a StringRef. Can't do it all at once.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br class="gmail_msg">
<br class="gmail_msg">
DataVisualization::Categories::ForEach(<br class="gmail_msg">
[®ex, &result](const lldb::TypeCategoryImplSP &category_sp) -> bool {<br class="gmail_msg">
if (regex) {<br class="gmail_msg">
bool escape = true;<br class="gmail_msg">
if (regex->GetText() == category_sp->GetName()) {<br class="gmail_msg">
escape = false;<br class="gmail_msg">
} else if (regex->Execute(llvm::StringRef::withNullAsEmpty(<br class="gmail_msg">
category_sp->GetName()))) {<br class="gmail_msg">
escape = false;<br class="gmail_msg">
}<br class="gmail_msg">
<br class="gmail_msg">
if (escape)<br class="gmail_msg">
return true;<br class="gmail_msg">
}<br class="gmail_msg">
<br class="gmail_msg">
No copy saved. Just wasted time with strlen in StringRef constructor.<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
TypeCategoryImpl::ForEachCallbacks<FormatterType> foreach;<br class="gmail_msg">
foreach<br class="gmail_msg">
.SetExact([&result, &formatter_regex, &any_printed](<br class="gmail_msg">
ConstString name,<br class="gmail_msg">
const FormatterSharedPointer &format_sp) -> bool {<br class="gmail_msg">
if (formatter_regex) {<br class="gmail_msg">
bool escape = true;<br class="gmail_msg">
if (name.GetStringRef() == formatter_regex->GetText()) {<br class="gmail_msg">
escape = false;<br class="gmail_msg">
} else if (formatter_regex->Execute(name.GetStringRef())) {<br class="gmail_msg">
escape = false;<br class="gmail_msg">
}<br class="gmail_msg">
<br class="gmail_msg">
No copy saved. Just wasted time with strlen in StringRef constructor.<br class="gmail_msg">
<br class="gmail_msg">
bool ParseCoordinate(const char *id_cstr) {<br class="gmail_msg">
RegularExpression regex;<br class="gmail_msg">
RegularExpression::Match regex_match(3);<br class="gmail_msg">
<br class="gmail_msg">
llvm::StringRef id_ref = llvm::StringRef::withNullAsEmpty(id_cstr);<br class="gmail_msg">
bool matched = false;<br class="gmail_msg">
if (regex.Compile(llvm::StringRef("^([0-9]+),([0-9]+),([0-9]+)$")) &&<br class="gmail_msg">
regex.Execute(id_ref, ®ex_match))<br class="gmail_msg">
matched = true;<br class="gmail_msg">
else if (regex.Compile(llvm::StringRef("^([0-9]+),([0-9]+)$")) &&<br class="gmail_msg">
regex.Execute(id_ref, ®ex_match))<br class="gmail_msg">
matched = true;<br class="gmail_msg">
else if (regex.Compile(llvm::StringRef("^([0-9]+)$")) &&<br class="gmail_msg">
regex.Execute(id_ref, ®ex_match))<br class="gmail_msg">
<br class="gmail_msg">
No copy saved. Just wasted time with strlen in StringRef constructor.<br class="gmail_msg"></blockquote><div>Again, the solution is to have ParseCoordinate take a StringRef. Problem solved.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br class="gmail_msg">
void DWARFCompileUnit::ParseProducerInfo() {<br class="gmail_msg">
m_producer_version_major = UINT32_MAX;<br class="gmail_msg">
m_producer_version_minor = UINT32_MAX;<br class="gmail_msg">
m_producer_version_update = UINT32_MAX;<br class="gmail_msg">
<br class="gmail_msg">
const DWARFDebugInfoEntry *die = GetCompileUnitDIEPtrOnly();<br class="gmail_msg">
if (die) {<br class="gmail_msg">
<br class="gmail_msg">
const char *producer_cstr = die->GetAttributeValueAsString(<br class="gmail_msg">
m_dwarf2Data, this, DW_AT_producer, NULL);<br class="gmail_msg">
if (producer_cstr) {<br class="gmail_msg">
RegularExpression llvm_gcc_regex(<br class="gmail_msg">
llvm::StringRef("^4\\.[012]\\.[01] \\(Based on Apple "<br class="gmail_msg">
"Inc\\. build [0-9]+\\) \\(LLVM build "<br class="gmail_msg">
"[\\.0-9]+\\)$"));<br class="gmail_msg">
if (llvm_gcc_regex.Execute(llvm::StringRef(producer_cstr))) {<br class="gmail_msg">
m_producer = eProducerLLVMGCC;<br class="gmail_msg">
} else if (strstr(producer_cstr, "clang")) {<br class="gmail_msg">
static RegularExpression g_clang_version_regex(<br class="gmail_msg">
llvm::StringRef("clang-([0-9]+)\\.([0-9]+)\\.([0-9]+)"));<br class="gmail_msg">
RegularExpression::Match regex_match(3);<br class="gmail_msg">
if (g_clang_version_regex.Execute(llvm::StringRef(producer_cstr),<br class="gmail_msg">
®ex_match)) {<br class="gmail_msg">
<br class="gmail_msg">
No copy saved. Just wasted time with strlen in StringRef constructor (2 of them mind you).<br class="gmail_msg"></blockquote><div>In this case the solution is to have GetAttributeValueAsString *return* a StringRef. Problem solved again. Note that there are already 2 strlens in the original code. One internally in each execution of the regular expression. Now there are 4. Once GetAttributeValueAsString() returns a StringRef, there will be 0.</div><div> </div><div><br></div><div>And similarly, I could keep going, but I think you get the idea.</div></div></div>