<div dir="ltr">On Thu, May 9, 2013 at 6:16 AM, Nadav Rotem <span dir="ltr"><<a href="mailto:nrotem@apple.com" target="_blank">nrotem@apple.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word"><div class="im"><br><div><div><br></div><blockquote type="cite"><div class="gmail_extra" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">

I'm not sure I understand the full impact of this example, and I would like to.</div><div class="gmail_extra" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">

<br></div><div class="gmail_extra" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">

What are the desired memory model semantics for a masked store? Specifically, let me suppose a simplified vector model of <2 x i64> on an i64-word-size platform.</div><div class="gmail_extra" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">

<br></div></blockquote><br></div></div><div>Hi Chandler, </div><div><br></div><div>I brought the example in this email thread to show that the optimizations that we currently have won't work on masked load/store operations because they don't take the mask into consideration. The memory model is an interesting question, but I am not sure how it is related.</div>

</div></blockquote><div><br></div><div style>I understand why you originally brought up the example, but I'm trying to dig deeper. The memory model is related because it drastically changes the set of options available for representing masked loads and stores. Masked loads and stores must participate in the memory model (just as all other loads and stores do), and we will need to update the model to reflect them if they are added to the IR. I'm trying to understand the potential requirements of the actual code in question.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div> In our example you can see the problem with a single thread.</div></div>

</blockquote><div><br></div><div style>Sure, but any solution is going to have to solve both single-threaded problems and concurrent problems. Also, the memory model isn't *only* about concurrent issues. It's a fundamental description of how loads and stores interact. That's why I presented the examples I did, and I would very much appreciate you responding to the entirety of my email.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div> Both MIC and AVX[1] have masked store operations and they have a different memory model.</div>

</div></blockquote><div><br></div><div style>Ok, but you've only given me a link to one set of instructions, and it doesn't really have a proper description of the memory model. If I assume that this instruction's specification is the complete spec for AVX masked stores, then they are atomic stores (!!!), which means that they do not form a data race on *any* of the memory locations. This is, to say the least, a deeply surprising and restrictive model. Certainly, it seems too strong a guarantee to enshrine as the *only* form of masked store in the IR.</div>

</div></div></div>