<html>
<head>
</head>
<body class='hmmessage'><div dir='ltr'>

<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 12pt;
font-family:Calibri
}
--></style>
<div dir="ltr">Hi,<div><br></div><div>I have encountered an issue that appears to be a serious, reproducible bug in LLVM-GCC 4.2.</div><div>It can be reproduced by compiling the following C++ file, which uses Boost:</div><div><p style="font-size: 11px; font-family: Menlo; color: rgb(202, 44, 34); "><span style="color: #75492d">#include </span>"boost/statechart/event.hpp"</p>
<p style="font-size: 11px; font-family: Menlo; min-height: 13px; "><br></p>
<p style="font-size: 11px; font-family: Menlo; color: rgb(182, 30, 161); ">using<span style="color: #000000"> </span>namespace<span style="color: #000000"> </span><span style="color: #7135a7">std</span><span style="color: #000000">;</span></p>
<p style="font-size: 11px; font-family: Menlo; min-height: 13px; "><br></p>
<p style="font-size: 11px; font-family: Menlo; "><span style="color: #b61ea1">class</span> EvActivate : <span style="color: #b61ea1">public</span> <span style="color: #7135a7">boost</span>::<span style="color: #7135a7">statechart</span>::<span style="color: #7135a7">event</span>&lt; <span style="color: #548187">EvActivate</span> &gt;</p>
<p style="font-size: 11px; font-family: Menlo; ">{</p>
<p style="font-size: 11px; font-family: Menlo; color: rgb(182, 30, 161); ">public<span style="color: #000000">:</span></p>
<p style="font-size: 11px; font-family: Menlo; ">    EvActivate(){}</p>
<p style="font-size: 11px; font-family: Menlo; min-height: 13px; ">    </p>
<p style="font-size: 11px; font-family: Menlo; color: rgb(182, 30, 161); ">private<span style="color: #000000">:</span></p>
<p style="font-size: 11px; font-family: Menlo; ">};</p>
<p style="font-size: 11px; font-family: Menlo; min-height: 13px; "><br></p>
<p style="font-size: 11px; font-family: Menlo; "><span style="color: #b61ea1">extern</span> <span style="color: #ca2c22">"C"</span> <span style="color: #b61ea1">const</span> <span style="color: #b61ea1">void</span>* activate()</p>
<p style="font-size: 11px; font-family: Menlo; ">{</p>
<p style="font-size: 11px; font-family: Menlo; color: rgb(64, 18, 128); "><span style="color: #000000">    </span><span style="color: #b61ea1">return</span><span style="color: #000000"> (</span><span style="color: #548187">EvActivate</span><span style="color: #000000">()).</span>intrusive_from_this<span style="color: #000000">().</span>get<span style="color: #000000">();</span></p>
<p style="font-size: 11px; font-family: Menlo; ">}</p></div><div><br></div><div>The problem is that the generated assembler looks like:</div><div><div>_activate:</div><div>00000000<span class="Apple-tab-span" style="white-space:pre">   </span>    b5f0<span class="Apple-tab-span" style="white-space:pre">  </span>push<span class="Apple-tab-span" style="white-space:pre">        </span>{r4, r5, r6, r7, lr}</div><div>00000002<span class="Apple-tab-span" style="white-space:pre"> </span>    af03<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r7, sp, #12</div><div>00000004<span class="Apple-tab-span" style="white-space:pre">  </span>e92d0d00<span class="Apple-tab-span" style="white-space:pre">    </span>stmdb<span class="Apple-tab-span" style="white-space:pre">       </span>sp!, {r8, sl, fp}</div><div>00000008<span class="Apple-tab-span" style="white-space:pre">    </span>ed2d8b10<span class="Apple-tab-span" style="white-space:pre">    </span>vstmdb<span class="Apple-tab-span" style="white-space:pre">      </span>sp!, {d8-d15}</div><div>0000000c<span class="Apple-tab-span" style="white-space:pre">        </span>    b094<span class="Apple-tab-span" style="white-space:pre">  </span>sub<span class="Apple-tab-span" style="white-space:pre"> </span>sp, #80</div><div>0000000e<span class="Apple-tab-span" style="white-space:pre">      </span>f2405088<span class="Apple-tab-span" style="white-space:pre">    </span>movw<span class="Apple-tab-span" style="white-space:pre">        </span>r0, :lower16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc</div><div>00000012<span class="Apple-tab-span" style="white-space:pre">     </span>    2300<span class="Apple-tab-span" style="white-space:pre">  </span>movs<span class="Apple-tab-span" style="white-space:pre">        </span>r3, #0</div><div>00000014<span class="Apple-tab-span" style="white-space:pre">       </span>f2c00000<span 
class="Apple-tab-span" style="white-space:pre">    </span>movt<span class="Apple-tab-span" style="white-space:pre">        </span>r0, :upper16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc</div><div>00000018<span class="Apple-tab-span" style="white-space:pre">     </span>f2407140<span class="Apple-tab-span" style="white-space:pre">    </span>movw<span class="Apple-tab-span" style="white-space:pre">        </span>r1, :lower16:0x770-0x2c+0xfffffffc</div><div>0000001c<span class="Apple-tab-span" style="white-space:pre">   </span>f2c00100<span class="Apple-tab-span" style="white-space:pre">    </span>movt<span class="Apple-tab-span" style="white-space:pre">        </span>r1, :upper16:0x770-0x2c+0xfffffffc</div><div>00000020<span class="Apple-tab-span" style="white-space:pre">   </span>f24052c8<span class="Apple-tab-span" style="white-space:pre">    </span>movw<span class="Apple-tab-span" style="white-space:pre">        </span>r2, :lower16:__ZTV10EvActivate-0x34+0xfffffffc</div><div>00000024<span class="Apple-tab-span" style="white-space:pre">       </span>    4478<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r0, pc</div><div>00000026<span class="Apple-tab-span" style="white-space:pre">       </span>f2c00200<span class="Apple-tab-span" style="white-space:pre">    </span>movt<span class="Apple-tab-span" style="white-space:pre">        </span>r2, :upper16:__ZTV10EvActivate-0x34+0xfffffffc</div><div>0000002a<span class="Apple-tab-span" style="white-space:pre">       </span>    9304<span class="Apple-tab-span" style="white-space:pre">  </span>str<span class="Apple-tab-span" style="white-space:pre"> </span>r3, [sp, #16]</div><div>0000002c<span class="Apple-tab-span" style="white-space:pre">        </span>    4479<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r1, 
pc</div><div>0000002e<span class="Apple-tab-span" style="white-space:pre">       </span>    9005<span class="Apple-tab-span" style="white-space:pre">  </span>str<span class="Apple-tab-span" style="white-space:pre"> </span>r0, [sp, #20]</div><div>00000030<span class="Apple-tab-span" style="white-space:pre">        </span>    a803<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r0, sp, #12</div><div>00000032<span class="Apple-tab-span" style="white-space:pre">  </span>    9006<span class="Apple-tab-span" style="white-space:pre">  </span>str<span class="Apple-tab-span" style="white-space:pre"> </span>r0, [sp, #24]</div><div>00000034<span class="Apple-tab-span" style="white-space:pre">        </span>    447a<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r2, pc</div><div>00000036<span class="Apple-tab-span" style="white-space:pre">       </span>f8ddc018<span class="Apple-tab-span" style="white-space:pre">    </span>ldr.w<span class="Apple-tab-span" style="white-space:pre">       </span>ip, [sp, #24]</div><div>0000003a<span class="Apple-tab-span" style="white-space:pre">        </span>    3004<span class="Apple-tab-span" style="white-space:pre">  </span>adds<span class="Apple-tab-span" style="white-space:pre">        </span>r0, #4</div><div>0000003c<span class="Apple-tab-span" style="white-space:pre">       </span>    9001<span class="Apple-tab-span" style="white-space:pre">  </span>str<span class="Apple-tab-span" style="white-space:pre"> </span>r0, [sp, #4]</div><div>0000003e<span class="Apple-tab-span" style="white-space:pre"> </span>    6808<span class="Apple-tab-span" style="white-space:pre">  </span>ldr<span class="Apple-tab-span" style="white-space:pre"> </span>r0, [r1, #0]</div><div>00000040<span class="Apple-tab-span" style="white-space:pre"> </span>f1020108<span class="Apple-tab-span" style="white-space:pre"> 
   </span>add.w<span class="Apple-tab-span" style="white-space:pre">       </span>r1, r2, #8<span class="Apple-tab-span" style="white-space:pre">  </span>@ 0x8</div><div>00000044<span class="Apple-tab-span" style="white-space:pre">        </span>f8cc1000<span class="Apple-tab-span" style="white-space:pre">    </span>str.w<span class="Apple-tab-span" style="white-space:pre">       </span>r1, [ip]</div><div>00000048<span class="Apple-tab-span" style="white-space:pre">     </span>f3bf8f5a<span class="Apple-tab-span" style="white-space:pre">    </span>dmb<span class="Apple-tab-span" style="white-space:pre"> </span>ishst</div><div>0000004c<span class="Apple-tab-span" style="white-space:pre">        </span>    9901<span class="Apple-tab-span" style="white-space:pre">  </span>ldr<span class="Apple-tab-span" style="white-space:pre"> </span>r1, [sp, #4]</div><div><b>0000004e<span class="Apple-tab-span" style="white-space:pre">        </span>e8512f00<span class="Apple-tab-span" style="white-space:pre">    </span>ldrex<span class="Apple-tab-span" style="white-space:pre">       </span>r2, [r1]</b></div><div><b>00000052<span class="Apple-tab-span" style="white-space:pre">  </span>    9200<span class="Apple-tab-span" style="white-space:pre">  </span>str<span class="Apple-tab-span" style="white-space:pre"> </span>r2, [sp, #0]</b></div><div><b>00000054<span class="Apple-tab-span" style="white-space:pre">      </span>    441a<span class="Apple-tab-span" style="white-space:pre">  </span>add<span class="Apple-tab-span" style="white-space:pre"> </span>r2, r3</b></div><div><b>00000056<span class="Apple-tab-span" style="white-space:pre">    </span>e8412c00<span class="Apple-tab-span" style="white-space:pre">    </span>strex<span class="Apple-tab-span" style="white-space:pre">       </span>ip, r2, [r1]</b></div><div><b>0000005a<span class="Apple-tab-span" style="white-space:pre">      </span>f1bc0f00<span class="Apple-tab-span" style="white-space:pre">    </span>cmp.w<span 
class="Apple-tab-span" style="white-space:pre">       </span>ip, #0<span class="Apple-tab-span" style="white-space:pre">      </span>@ 0x0</b></div><div><b>0000005e<span class="Apple-tab-span" style="white-space:pre">     </span>    d1f6<span class="Apple-tab-span" style="white-space:pre">  </span>bne.n<span class="Apple-tab-span" style="white-space:pre">       </span>0x4e</b></div></div><div>...</div><div><br></div><div>What happens in the code between 0x4e and 0x5e is an atomic access to a variable by the inlined <span style="font-family: Arial, sans-serif; font-size: 11px; ">__exchange_and_add</span><span style="font-size: 12pt; ">. The problem is that, as a result of inlining, the value read by ldrex is stored on the stack for later use. However, because the atomically accessed variable is also on the stack and resides very close to this compiler-introduced temporary, the write falls within the same Exclusives Reservation Granule (</span><a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s02s01.html" style="font-size: 12pt; ">ERG</a><span style="font-size: 12pt; ">). On Apple's A6X devices this reproduced consistently: the code entered an infinite loop, because the str instruction at 0x52 caused the strex at 0x56 to always fail and return 1, and the following branch restarted the sequence. 
</span></div><div><br></div><div>Generating such code violates the ARM <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s02s01.html">recommendation</a>:</div><div>"<span style="font-family: Verdana, Tahoma, Arial, Helvetica, sans-serif; font-size: 10pt; ">For these reasons ARM recommends that:</span></div><div class="itemizedlist" style="font-family: Verdana, Tahoma, Arial, Helvetica, sans-serif; font-size: small; "><ul type="disc" compact="compact" style="margin-top: 0.4em; margin-bottom: 0.2em; "><li style="margin-top: 0.3em; margin-bottom: 0.2em; "><p style="margin-top: 0.3em; margin-bottom: 0.2em; ">the Load-Exclusive and Store-Exclusive are no more than 128 bytes apart</p></li><li style="margin-top: 0.3em; margin-bottom: 0.2em; "><p style="margin-top: 0.3em; margin-bottom: 0.2em; ">no explicit cache maintenance operations or data accesses are performed between the Load-Exclusive and the Store-Exclusive.<span style="font-family: Calibri, sans-serif; font-size: 12pt; ">"</span></p><p style="margin-top: 0.3em; margin-bottom: 0.2em; "><span style="font-family: Calibri, sans-serif; font-size: 12pt; "><br></span></p><p style="margin-top: 0.3em; margin-bottom: 0.2em; "><span style="font-family: Calibri, sans-serif; font-size: 12pt; ">I've encountered this issue in real code and would be glad to get feedback on it. Please let me know if I need to file a bug somewhere to get it resolved. I've found that clang does not have this problem.</span></p><p style="margin-top: 0.3em; margin-bottom: 0.2em; "><span style="font-family: Calibri, sans-serif; font-size: 12pt; "><br></span></p><p style="margin-top: 0.3em; margin-bottom: 0.2em; "><span style="font-family: Calibri, sans-serif; font-size: 12pt; "><br></span></p><p style="margin-top: 0.3em; margin-bottom: 0.2em; "><span style="font-family: Calibri, sans-serif; font-size: 12pt; ">Moshe</span></p></li></ul></div></div>
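<div><br></div>
<div>For anyone who wants to look at this without Boost, a reduced test case might look like the sketch below. It is only an illustration, not part of the original reproduction: it assumes that the inlined __exchange_and_add ends up as the GCC __sync_fetch_and_add builtin on this toolchain, the function name read_counter_atomically is hypothetical, and it has not been confirmed to produce the same instruction sequence as the Boost example.</div>
<pre style="font-size: 11px; font-family: Menlo; ">
```cpp
// Hypothetical reduced test case (an assumption, not taken from the report).
// It atomically reads a stack-resident counter by adding 0 to it, which is
// what the ldrex/add/strex loop in the listing appears to do (r3 holds 0).
extern "C" int read_counter_atomically()
{
    // The counter lives on the stack, like the atomically accessed
    // variable described in the report.
    volatile int counter = 42;

    // GCC/Clang builtin: atomically adds 0 and returns the previous value.
    // Assumed here to be what __exchange_and_add expands to.
    return __sync_fetch_and_add(&counter, 0);
}
```
</pre>
<div>Compiling this with the same compiler and flags as the file above and disassembling the result should show whether a str to the stack still appears between the ldrex and the strex.</div>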
                                          </div></body>
</html>