<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 26, 2015 at 9:00 AM, David Chisnall <span dir="ltr"><<a href="mailto:David.Chisnall@cl.cam.ac.uk" target="_blank">David.Chisnall@cl.cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":g6w" class="a3s" style="overflow:hidden">On 26 Jan 2015, at 16:45, <a href="mailto:sohachak@mpi-sws.org">sohachak@mpi-sws.org</a> wrote:<br>

><br>

> In the generated IR load(a) is independent of flag value which is not the<br>

> case in the source program. Hence there is an introduced race between<br>

> load(a) and store(a) operations when readA() and writeA() runs<br>

> concurrently.<br>

<br>

I believe that this is still not observably racy.  The load is not volatile which, in LLVM IR means that it is not allowed to have observable side effects.  The address of the load is a constant, so a side-effect-free load whose result is not used does not appear to be racy within the semantics of LLVM IR - it's just an expensive way of doing a nop.  It should materialise an undef value (if someone were to write a formal model of LLVM IR with full concurrency semantics), but a select where the unused value is undef is still well defined in LLVM IR.</div></blockquote></div><br>FWIW, the C and C++ memory models were design specifically to allow load speculation, and the LLVM memory model was based heavily on them and specifically allows load speculation.</div><div class="gmail_extra"><br></div><div class="gmail_extra">As with any speculation, a number of invariants must hold that prove it to be safe, and there are plenty of circumstances where it isn't profitable, but the memory model is design to allow it.</div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">Note that race detection tools typically turn these parts of the optimizer off. You can see this in several places where we disable load speculation or load widening when running under TSan specifically so that it can enforce a more conservative model.</div></div>