<div dir="ltr"><div>_mm_lfence was originally documented as a load fence, but in light of speculative execution vulnerabilities it is now also advertised as a way to prevent speculative execution. The current Intel Software Developer's Manual describes it as follows: "Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes".</div><div><br></div><div>For the following test, my intention was to ensure that the body of either the if or the else would not proceed until any speculation of the branch had resolved. But SimplifyCFG saw that both control paths started with an lfence and hoisted them into a single lfence intrinsic before the branch. <a href="https://godbolt.org/z/qMc446">https://godbolt.org/z/qMc446</a>
The intrinsic has no properties in IR, so it should be assumed to read and write any memory, but that's not enough to express this control flow dependency. gcc exhibits similar behavior.</div><div><br></div><div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254);font-family:Consolas"><div><span style="color:rgb(0,0,255)">#include</span> <span style="color:rgb(0,0,255)">&lt;</span><span style="color:rgb(163,21,21)">x86intrin.h</span><span style="color:rgb(0,0,255)">&gt;</span></div><br><div><span style="color:rgb(0,0,255)">void</span> bar();</div><div><span style="color:rgb(0,0,255)">void</span> baz();</div><br><div><span style="color:rgb(0,0,255)">void</span> foo(<span style="color:rgb(0,0,255)">int</span> c) {</div><div> <span style="color:rgb(0,0,255)">if</span> (c) {</div><div> _mm_lfence();</div><div> bar();</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> _mm_lfence();</div><div> baz();</div><div> }</div><div>}</div><div><br></div><div><br></div><div>Alternatively, I tried replacing the intrinsics with inline assembly. SimplifyCFG still merged those, but gcc did not. 
<a href="https://godbolt.org/z/acnPxY">https://godbolt.org/z/acnPxY</a></div><div><br></div><div><div><div><span style="color:rgb(0,0,255)">void</span> bar();</div><div><span style="color:rgb(0,0,255)">void</span> baz();</div><br><div><span style="color:rgb(0,0,255)">void</span> foo(<span style="color:rgb(0,0,255)">int</span> c) {</div><div> <span style="color:rgb(0,0,255)">if</span> (c) {</div><div> __asm__ __volatile (<span style="color:rgb(163,21,21)">"lfence"</span>);</div><div> bar();</div><div> } <span style="color:rgb(0,0,255)">else</span> {</div><div> __asm__ __volatile (<span style="color:rgb(163,21,21)">"lfence"</span>);</div><div> baz();</div><div> }</div><div>}</div></div></div><div><br></div><div>I believe the [[clang::nomerge]] attribute was recently extended to inline assembly (<a href="https://reviews.llvm.org/D84225">https://reviews.llvm.org/D84225</a>), which can be used to prevent SimplifyCFG from hoisting the inline assembly. It also appears to work for the intrinsic version, but I think it's limited to C++ only.</div><div><br></div><div>Is there some existing property we can put on the intrinsic to prevent SimplifyCFG from hoisting like this? Are we more aggressive than we should be about hoisting inline assembly?</div><div><br></div><div>Thanks,</div></div></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div></div>