<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 9, 2013 at 6:04 PM,  <span dir="ltr"><<a href="mailto:dag@cray.com" target="_blank">dag@cray.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div class="im">Dan Gohman <<a href="mailto:dan433584@gmail.com">dan433584@gmail.com</a>> writes:<br>

<br>

>     But I don't understand why defining this as not being a data race<br>

>     would complicate things. I'm assuming the mask values are<br>

>     statically known.  Can you explain a bit more?<br>

><br>

> It's an interesting question for autovectorization, for example.<br>

><br>

> Thread A:<br>

> for (i=0;i<n;++i)<br>

> if (i&1)<br>

> X[i] = 0;<br>

><br>

> Thread B:<br>

> for (i=0;i<n;++i)<br>

> if (!(i&1))<br>

> X[i] = 1;<br>

><br>

> The threads run concurrently without synchronization. As written,<br>

> there is no race.<br>

<br>

</div>There is no race *if* the hardware cache coherence says so.  :) There<br>

are false sharing issues here and different machines have behaved very<br>

differently in the past.<br></blockquote><div><br></div><div style>Let's not conflate races with false sharing. They're totally different, and false sharing is *not* what we're discussing here.</div><div> </div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">The result entirely depends on the machine's consistency model.<br>


<br>

LLVM is a virtual machine and the IR should define a consistency model.<br>

Everything flows from that.  I think ideally we'd define the model such<br>

that there is no race in the scalar code and the compiler would be free<br>

to vectorize it.  This is a very strict consistency model and for<br>

targets with relaxed semantics, LLVM would have to insert<br>

synchronization operations or choose not to vectorize.</blockquote></div><br>LLVM already has a memory model. We don't need to add one. ;] It's here for reference: <a href="http://llvm.org/docs/LangRef.html#memmodel">http://llvm.org/docs/LangRef.html#memmodel</a></div>

<div class="gmail_extra"><br></div><div class="gmail_extra" style>Also, cache coherency is *not* the right way to think about a memory model. Framing it that way makes it extremely hard to understand and define what optimization passes are allowed to do. I think LLVM's memory model does a very good job of this for both scalar and vector code today. If you spot problems with it, let's start a thread to address them. I suspect Jeffrey, Owen, and I will all be extremely interested in discussing any such issues.</div>

<div class="gmail_extra" style><br></div><div class="gmail_extra" style>The only relevant thing missing from the model is something that isn't in LLVM today -- masked loads and stores. And that was what inspired my original question. =D</div>
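<div class="gmail_extra"><br></div><div class="gmail_extra">To make the concern concrete, here is a rough scalar model of two ways Thread A's loop from the quoted example might be vectorized. It is only a sketch: the vector length <code>VL</code>, the function names, and the <code>between</code> hook (which stands in for another thread's stores landing mid-iteration) are all made up for illustration and are not LLVM APIs. The first lowering loads a full vector, blends, and stores every lane back; the second models a masked store that writes only the selected lanes.</div>

```c
#include <assert.h>
#include <stddef.h>

enum { VL = 4 }; /* hypothetical vector length, chosen for illustration */

/* Thread B from the quoted example: writes only even indices. */
static void thread_b(int *x, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (!(i & 1))
            x[i] = 1;
}

/* Thread A lowered as load + blend + full-width store.
 * `between` is a hypothetical hook modelling another thread's stores
 * landing after the vector load but before the vector store. */
static void thread_a_blend(int *x, size_t n, void (*between)(int *, size_t)) {
    assert(n % VL == 0); /* keep the sketch simple: no remainder loop */
    for (size_t base = 0; base < n; base += VL) {
        int tmp[VL];
        for (size_t l = 0; l < VL; ++l)   /* "vector load" of all lanes */
            tmp[l] = x[base + l];
        if (between)
            between(x, n);
        for (size_t l = 0; l < VL; ++l)   /* "vector store" writes every lane */
            x[base + l] = ((base + l) & 1) ? 0 : tmp[l];
    }
}

/* Thread A lowered with a masked store: unselected lanes are never written. */
static void thread_a_masked(int *x, size_t n, void (*between)(int *, size_t)) {
    assert(n % VL == 0);
    for (size_t base = 0; base < n; base += VL) {
        if (between)
            between(x, n);
        for (size_t l = 0; l < VL; ++l)   /* only odd indices are stored to */
            if ((base + l) & 1)
                x[base + l] = 0;
    }
}
```

<div class="gmail_extra">Starting from an array of four 7s and passing <code>thread_b</code> as the hook, the blend version ends with 7 0 7 0: Thread B's writes to the even elements are overwritten with stale values, because the full-width store introduced stores that the scalar code never performed. The masked version ends with 1 0 1 0, matching the scalar program. Whether a real masked store instruction gives that guarantee on a given target is exactly the open question.</div>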

</div>