<br> <br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- I have read the spec, and my conclusion is that a barrier is a work-group sync point whatever the flags are. So I think we must have a barrier nofence() call.<br>
<br></blockquote><div>I would agree, though the spec is ambiguous. As the fallback (else) case for a flags argument that is not a compile-time constant, I would make it fence all address spaces. I remember finding that a non-constant argument was not allowed, but I have never been able to find again where in the spec that is specified. It should be a frontend warning anyway.</div>
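<div>A minimal sketch of that fallback in plain C, not the actual lowering code. The names barrier_fence_scope, SKETCH_LOCAL_MEM_FENCE, SKETCH_GLOBAL_MEM_FENCE and SKETCH_ALL_MEM_FENCES are hypothetical and their values are assumptions made only for this illustration:</div>

```c
/* Hypothetical flag values for this sketch; real OpenCL headers define their own. */
enum {
    SKETCH_LOCAL_MEM_FENCE  = 1,
    SKETCH_GLOBAL_MEM_FENCE = 2,
    SKETCH_ALL_MEM_FENCES   = 3  /* local and global together */
};

/*
 * Returns the set of address spaces a barrier call should fence.
 * flags_is_constant: nonzero when the flags argument folded to a
 *                    compile-time constant.
 * flags:             the constant value (ignored when not constant).
 */
static unsigned barrier_fence_scope(int flags_is_constant, unsigned flags)
{
    if (!flags_is_constant)
        return SKETCH_ALL_MEM_FENCES; /* conservative fallback: fence everything */
    return flags; /* may be 0: work-group sync only, no memory fence */
}
```

<div>With a constant flags value of 0 this returns 0, which corresponds to the no-fence barrier discussed above. The work-group synchronization itself happens in every case.</div>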
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- The localglobal() stuff used everywhere is there to mimic what the closed driver seems to do. In its IR output we can see that it uses different pseudo-instructions for all the possibilities: barriers and memory fences seem to have different intrinsics depending on the flags.</blockquote>
<div><br></div><div>This is because, in AMDIL, the same fence instruction with different modifiers implements all of the variations of barrier and mem_fence. LLVM is not aware of the hardware details of how it works and does not do any real scheduling.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> So I thought it might be interesting to do the same.<br>
Thanks to that, it is really easy to lower the intrinsics correctly, and we would have nothing to change if some hardware someday had a special instruction for every combination (very unrealistic, however).<br>
But I can change that if you want.<br>
<br>
- I have considered making a very simple implementation of barriers with a call to mem_fence and the actual barrier intrinsic. But the closed driver has special intrinsics, so... ^^</blockquote><div><br></div><div>As mentioned in the LLVM thread, barrier can't be used to implement a mem_fence.</div>