<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Dec 11, 2015 at 3:22 AM, Philip Reames via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Currently, we limit atomic loads and stores to either pointer or integer types.  I would like to propose that we extend this to allow both floating point and vector types which meet the other requirements.  (i.e. power-of-two multiple of 8 bits, and aligned)<br></blockquote><div><br></div><div>I support this.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

This will enable a couple of follow-on changes:<br>

1) Teaching the vectorizer how to vectorize unordered atomic loads and stores<br>

2) Removing special casing around type canonicalization of loads in various passes<br>

3) Removing complexity from language frontends which need to support atomic operations on floating point types.<br></blockquote><div><br></div><div>This may become relevant for C++, see <a href="http://wg21.link/p0020r0">http://wg21.link/p0020r0</a></div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

My initial implementation plan will not require any changes from the backends.  I plan to add a lowering step to the existing AtomicExpandPass which will convert atomic operations on floats and vectors to their equivalently sized integer counterparts.  Over time, individual backends will be able to opt in - via a TTI hook - to have the new types of atomics handled by the normal isel machinery.<br>

<br>

I've prototyped this approach with the x86 backend and get what looks like correct and even fairly efficient instruction selection taking place.  I haven't studied it too extensively, so it might not work out in the end, but the approach appears generally feasible.<br></blockquote><div><br></div><div>Simpler path than <a href="http://reviews.llvm.org/D11382">D11382</a>, generating the same code?</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

One open question I don't know the answer to: Are there any special semantics required from floating point stores which aren't met by simply bitcasting their result to i32 (float) or i64 (double) and storing the result?  In particular, I'm unsure of the semantics around canonicalization here.  Are there any? Same for loads?<br></blockquote><div><br></div><div>I'd go a bit further: should you also support basic FP operations atomically? The above C++ paper adds add/sub, and we've discussed adding FMA as well.</div><div><br></div><div>This raises similar issues around FP exceptions (are they impl-defined, UB, unspecified? Do we care?).</div></div></div></div>
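<div dir="ltr"><div><br></div><div>To make the discussion concrete, here is a minimal C++ sketch of the lowering idea above: float atomics expressed through same-sized integer atomics via a bit-preserving cast, plus a compare-exchange loop for an atomic FP add. The names AtomicFloat, float_to_bits, bits_to_float and demo are hypothetical and chosen only for illustration; this is not the AtomicExpandPass implementation nor the interface proposed in P0020, and it assumes float is 32 bits wide.</div></div>

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float has the same size as a 32-bit integer on this target.
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Hypothetical helpers: reinterpret the bits without any FP operation,
// analogous to an IR bitcast between float and i32.
inline std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

inline float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical wrapper: every atomic access goes through the integer type.
struct AtomicFloat {
    std::atomic<std::uint32_t> bits{0};

    void store(float v, std::memory_order mo = std::memory_order_seq_cst) {
        bits.store(float_to_bits(v), mo);
    }

    float load(std::memory_order mo = std::memory_order_seq_cst) const {
        return bits_to_float(bits.load(mo));
    }

    // Atomic FP add built from an integer compare-exchange loop.
    // Returns the value held before the addition.
    float fetch_add(float delta) {
        std::uint32_t old_bits = bits.load(std::memory_order_relaxed);
        // On failure compare_exchange_weak refreshes old_bits, and the
        // desired value is recomputed on the next iteration.
        while (!bits.compare_exchange_weak(
            old_bits, float_to_bits(bits_to_float(old_bits) + delta))) {
        }
        return bits_to_float(old_bits);
    }
};

// Small usage example with values that are exactly representable in float.
inline void demo() {
    AtomicFloat a;
    a.store(1.5f);
    assert(a.load() == 1.5f);
    assert(a.fetch_add(2.0f) == 1.5f);
    assert(a.load() == 3.5f);
}
```

<div dir="ltr"><div>In this sketch the store and load only move raw bits, so they perform no FP operation of their own. The addition inside fetch_add is an ordinary FP add, which is where the FP exception questions above would apply.</div></div>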