<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jan 22, 2015 at 12:25 AM, Dmitry Vyukov <span dir="ltr"><<a href="mailto:dvyukov@google.com" target="_blank">dvyukov@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><div class="">On Thu, Jan 22, 2015 at 11:14 AM, Chandler Carruth <<a href="mailto:chandlerc@google.com">chandlerc@google.com</a>> wrote:<br>

><br>

> On Wed, Jan 21, 2015 at 11:38 PM, Dmitry Vyukov <<a href="mailto:dvyukov@google.com">dvyukov@google.com</a>> wrote:<br>

>><br>

>> You are right.<br>

>> But this optimization is too fruitful to just discard it. So fruitful<br>

>> (-40% of instrumentation in a large webrtc test) that I am inclined<br>

>> towards ignoring the possibility of passing the object using relaxed<br>

>> atomics... On the other hand people do mess up memory ordering, so losing<br>

>> these races is a pity as well...<br>

><br>

><br>

> It would make me very sad to lose this feature of TSan. Of all the subtle<br>

> racy-queue techniques I have seen or heard of over the years, the one I<br>

> cited is actually one of the few that I have seen debugged specifically<br>

> through the use of TSan.<br>

><br>

> I also fear losing it in small part because it is a specific portability<br>

> risk between x86 and weak memory architectures, one of the biggest features<br>

> of TSan for me.<br>

><br>

> But it's wild that this is 40% of the instrumentation in a large webrtc<br>

> test. That seems to clearly indicate that there is *something* to be done<br>

> here, but I don't know yet what that is... so:<br>

><br>

>><br>

>> Maybe we can figure out a way to get both at least in most cases.<br>

><br>

><br>

> That would be my hope as well. =]<br>

><br>

>><br>

>> A few<br>

>> observations:<br>

>> 1. Leaking of stack objects to other threads is very infrequent (I<br>

>> would say 1%).<br>

><br>

><br>

> Infrequent relative to *captured* stack objects? Yes, but I'm not sure how<br>

> infrequent really. Mostly this is because I expect most stack objects to<br>

> never be captured.<br>

<br>

</div></div>If I do:<br>

<br>

std::string s(...);<br>

s.find(...);<br>

<br>

and find is not inlined, but ctor is inlined. Isn't it the case that<br>

s is captured, but stores in ctor can be elided by this optimization?</blockquote></div><br>Let's sort this out first... Might just be a terminology confusion.</div><div class="gmail_extra"><br></div><div class="gmail_extra">This shouldn't imply capturing in the strictest sense. It should just involve escape. For a more precise definition of what LLVM at least means by capture (unsure if this term is used elsewhere in the literature) see the comment at the top of the file: <a href="http://llvm.org/docs/doxygen/html/CaptureTracking_8cpp_source.html">http://llvm.org/docs/doxygen/html/CaptureTracking_8cpp_source.html</a></div><div class="gmail_extra"><br></div><div class="gmail_extra">Inherently, memory whose address isn't captured but is just escaped can't be used by another thread. So for nocapture pointers, I think we can eliminate *all* the instrumentation. Does that make sense?</div><div class="gmail_extra"><br></div><div class="gmail_extra">My current understanding is that, to be conservatively correct, we need to instrument the last write to memory prior to its first capture, and no writes (or reads) prior to that last write. Maybe just skipping escaped-but-nocapture would be enough to get most of the benefit here? Or maybe LLVM is missing really important cases that are actually nocapture?</div></div>
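<div class="gmail_extra"><br></div><div class="gmail_extra">To make the "last write prior to first capture" point concrete, here is a minimal sketch of the pattern under discussion. All names (Message, g_slot, publish_stack_object) are hypothetical and only for illustration. A stack object is written twice and then its address is stored into a global atomic, which is the point where the pointer is captured. As written, the hand-off uses release/acquire and is race-free. If both orderings were changed to memory_order_relaxed, the consumer's read of payload would race with the last write before the store, and that last write is the access that would presumably need to stay instrumented for the race to be reported.</div>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical message type and hand-off slot, for illustration only.
struct Message {
  int payload;
};

std::atomic<Message*> g_slot{nullptr};

int publish_stack_object() {
  int seen = 0;

  // Consumer spins until a pointer shows up, then reads through it.
  std::thread consumer([&seen] {
    Message* m = nullptr;
    while ((m = g_slot.load(std::memory_order_acquire)) == nullptr) {
      std::this_thread::yield();
    }
    seen = m->payload;
  });

  Message msg;       // stack object; its address is not visible to anyone yet
  msg.payload = 1;   // earlier write, overwritten before the address is stored
  msg.payload = 42;  // last write before the capture point

  // Capture point: the address of the stack object is stored to a global.
  // With memory_order_relaxed here (and on the load above), the read of
  // m->payload in the consumer would be a data race on weak memory models.
  g_slot.store(&msg, std::memory_order_release);

  consumer.join();  // msg stays alive until the consumer has finished
  g_slot.store(nullptr, std::memory_order_relaxed);
  return seen;
}
```

<div class="gmail_extra">By contrast, a call such as s.find(...) that takes the address but does not store it anywhere that outlives the call never reaches a point like the g_slot store above, so no other thread can observe the object through it.</div>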