<div dir="ltr">We have root-caused the segfault: it was due to a caching layer in our code that avoids duplicate compilations. llvm::JIT::getPointerToFunction() looks up the PassRegistry, and because our change introduces a separate PassRegistry for each thread, any thread that calls llvm::JIT::getPointerToFunction() must have its own PassRegistry set up. In our setup, some threads were seeing a cache hit on code that had been compiled by another thread. When such a thread called llvm::JIT::getPointerToFunction(), it segfaulted because its PassRegistry had not been set up.<div><br></div><div>Any comments on our change to the PassRegistry?</div><div><br></div><div>Thanks,</div><div>Nipun</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 24, 2015 at 1:50 PM, Nipun Sehrawat <span dir="ltr">&lt;<a href="mailto:nipun@thoughtspot.com" target="_blank">nipun@thoughtspot.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>We use LLVM libraries to compile C++ code and noticed slowdowns when multiple threads of a process were compiling at once. <i>perf</i> indicated that most of the CPU time was spent in a spin lock, which was being locked/unlocked from llvm::PassRegistry::getPassInfo().</div><div><br></div><div>We read the relevant LLVM code and found that PassRegistry is a ManagedStatic and is shared among all threads in a multi-threaded setup. This sharing requires locking in PassRegistry's methods, which becomes the source of the contention. To get rid of the contention, we changed PassRegistry to be thread-local and removed the locking. This removed all the contention, and we noticed a 2x speedup in single-threaded compiles and a 7x improvement when ten threads were compiling in parallel.</div><div><br></div><div>Please find attached the diff for this change. 
We are using an old version of the LLVM code (svn revision 170375), so the code might be quite outdated.</div><div><br></div><div>We have two questions:</div><div>1. Does the change look reasonable? Or are we missing something here?</div><div>2. When we run with 1000 threads compiling concurrently, we deterministically run into a segfault in the PassRegistry lookup. Any insights into the segfault?</div><div><br></div><div><br></div><div>Please find attached the following files:</div><div>1. <i>pass_registry.txt</i>: Git diff of our change. Note that it is against LLVM svn revision 170375.</div><div>2. <i>contention.txt</i>: Perf report with the existing LLVM code, showing contention in llvm::PassRegistry::getPassInfo().</div><div>3. <i>no_contention.txt</i>: Perf report of LLVM built with our change.</div><div>4. <i>segfault.txt</i>: The segfault we are encountering after our change.</div><div>5. <i>clang_compile.cpp</i>: Snippet of the code we use to compile code using LLVM.</div><div><br></div><div><br></div><div>Thanks a lot,</div><div>Nipun</div></div>

</blockquote></div><br></div>
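<div>To illustrate the root cause described at the top of this message, here is a minimal sketch of one way to guard a per-thread registry, assuming lazy initialization on first access. It is not LLVM's API and not the attached diff: <i>Registry</i>, <i>tls_registry</i>, <i>registryForThisThread</i>, <i>hasPass</i> and the pass name "example-pass" are all hypothetical stand-ins. The idea is that a thread which only hits the compilation cache, and never compiled anything itself, still gets a populated registry before its first lookup.</div>

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a per-thread pass registry (not LLVM's class).
struct Registry {
    std::map<std::string, int> passes;
};

// One registry pointer per thread; it stays null until that thread
// initializes it, which is the state a cache-hit-only thread would be in.
thread_local std::unique_ptr<Registry> tls_registry;

// Lazily create and populate the calling thread's registry, so a lookup
// on a thread that never compiled anything does not dereference null.
Registry& registryForThisThread() {
    if (!tls_registry) {
        tls_registry.reset(new Registry());
        tls_registry->passes["example-pass"] = 1;  // hypothetical registration
    }
    assert(tls_registry != nullptr);
    return *tls_registry;
}

// Lookup that is safe to call from any thread, including one that only
// reuses code compiled by another thread.
bool hasPass(const std::string& name) {
    return registryForThisThread().passes.count(name) == 1;
}
```

<div>Under these assumptions, every lookup goes through <i>registryForThisThread()</i>, so no thread needs to remember to set up its registry before calling into the cached code. Whether this fits the real PassRegistry depends on how passes get registered, which this sketch does not model.</div>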