<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Another option would be to make the demangler use a custom allocator everywhere. I'm not sure which demangler you are using in these tests, but we can use either the native system one from #include &lt;cxxabi.h&gt; or the fast demangler in FastDemangle.cpp. If it is the latter, we can probably optimize it. <div class=""><br class=""></div><div class="">The other thing to note is that local files are mmap'ed in, and paging doesn't show up very well in perf tests: when the system pages in the symbol files as they are read from memory, it looks like system time. You could try disabling the mmap path in DataBufferLLVM.cpp and see if it makes any difference. The call to llvm::MemoryBuffer::getFileSlice() takes a Volatile flag as its last argument. If you set this to true, we will read the file into memory instead of mmap'ing it. That will at least show whether any component of the time is due to mmap'ing. Currently we check whether the file is local (not on a network mount), and if it is, we mmap it. </div><div class=""><br class=""></div><div class="">Greg</div><div class=""><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On May 2, 2017, at 12:31 PM, Scott Smith &lt;<a href="mailto:scott.smith@purestorage.com" class="">scott.smith@purestorage.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">As it turns out, it was lock contention in the memory allocator.  
Using tcmalloc brought it from 8+ seconds down to 4.2 seconds.<br class=""><br class=""></div><div class="">I think this didn't show up in mutrace because glibc's malloc doesn't use pthread mutexes.<br class=""><br class=""></div><div class="">Greg, that joke about adding tcmalloc wholesale is looking less funny and more serious....  Or maybe it's enough to make it a CMake link option (use if present or use if requested).<br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Tue, May 2, 2017 at 8:42 AM, Jim Ingham <span dir="ltr" class="">&lt;<a href="mailto:jingham@apple.com" target="_blank" class="">jingham@apple.com</a>&gt;</span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I'm not sure about Linux, but on OS X lldb will mmap the debug information rather than using straight reads.  But that should happen just once per loaded module.<br class="">

<span class="m_-5012448146526590794HOEnZb"><font color="#888888" class=""><br class="">

Jim<br class="">

</font></span><div class="m_-5012448146526590794HOEnZb"><div class="m_-5012448146526590794h5"><br class="">

> On May 2, 2017, at 8:09 AM, Scott Smith via lldb-dev &lt;<a href="mailto:lldb-dev@lists.llvm.org" target="_blank" class="">lldb-dev@lists.llvm.org</a>&gt; wrote:<br class="">

><br class="">

> I've been trying to improve the parallelism of lldb but have run into an odd roadblock.  I have the code at the point where it creates 40 worker threads, and it stays that way because it has enough work to do.  However, running 'top -d 1' shows that for the time in question, CPU load never gets above 4-8 CPUs (even though I have 40).<br class="">

><br class="">

> 1. I tried mutrace, which measures mutex contention (I had to call unsetenv("LD_PRELOAD") in main() so it wouldn't propagate to the process being tested).  It indicated some minor contention, but not enough to be the problem.  Regardless, I converted everything I could to lock-free structures (TaskPool and ConstString) and it didn't help.<br class="">

><br class="">

> 2. I tried strace, but I don't think strace can figure out how to trace lldb.  It says it waits on a single futex for 8 seconds, and then is done.<br class="">

><br class="">

> I'm about to try lttng to trace all syscalls, but I was wondering if anyone else had any ideas?  At one point I wondered if it was mmap kernel semaphore contention, but that shouldn't affect faulting individual pages, and I assume lldb doesn't call mmap all the time.<br class="">

><br class="">

> I'm getting a bit frustrated because lldb should be taking 1-2 seconds to start up (it has ~45s of user+system work to do), but instead is taking 8-10, and I've been stuck there for a while.<br class="">

><br class="">

</div></div><div class="m_-5012448146526590794HOEnZb"><div class="m_-5012448146526590794h5">

<br class="">

</div></div></blockquote></div><br class=""></div></div>

</div></blockquote></div><br class=""></div></div></body></html>