<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">I reported my measurements in one of the early comments of the review thread: I saw only slight improvements, between 0.3% and 1.7%, and two 0.5% regressions.</div><div class=""><br class=""></div><div class="">However, this commit does not introduce more select instructions. The idea behind it is that for this code:</div><div class=""><div class=""><br class=""></div><div class="">long foo(long a, long b, long v1, long v2) {</div><div class="">&nbsp;&nbsp;if (a &gt;= v1 &amp;&amp; a &lt; v2)</div><div class="">&nbsp;&nbsp;&nbsp;&nbsp;return b;</div><div class="">&nbsp;&nbsp;return 0;</div><div class="">}</div></div><div class=""><br class=""></div><div class="">we used to generate:</div><div class=""><br class=""></div><div class=""><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cmp<span class="Apple-tab-span" style="white-space:pre">  </span>x0, x2</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cset<span class="Apple-tab-span" style="white-space:pre"> </span>w8, ge</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cmp<span class="Apple-tab-span" style="white-space:pre">  </span>x0, x3</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cset<span class="Apple-tab-span" style="white-space:pre"> </span>w9, lt</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>tst<span class="Apple-tab-span" style="white-space:pre">  </span>w8, w9</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>csel<span class="Apple-tab-span" style="white-space:pre"> </span>x0, x1, xzr, ne</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>ret</div></div><div class=""><br class=""></div><div class="">In 
another commit, I changed the generic backend to use two selects instead:</div><div class=""><br class=""></div><div class=""><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cmp<span class="Apple-tab-span" style="white-space:pre">  </span>x0, x2</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>csel<span class="Apple-tab-span" style="white-space:pre"> </span>x8, x1, xzr, ge</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>cmp<span class="Apple-tab-span" style="white-space:pre">  </span>x0, x3</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>csel<span class="Apple-tab-span" style="white-space:pre"> </span>x0, x8, xzr, lt</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>ret</div></div><div class=""><br class=""></div><div class="">In this commit, I turned off the two-select variant (by overriding shouldNormalizeToSelect()) so that the improved cmp/ccmp matching can generate:</div><div class=""><br class=""></div><div class=""><span class="Apple-tab-span" style="white-space: pre;">    </span>cmp<span class="Apple-tab-span" style="white-space: pre;">  </span>x0, x2</div><div class=""><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>ccmp<span class="Apple-tab-span" style="white-space:pre"> </span>x0, x3, #0, ge</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>csel<span class="Apple-tab-span" style="white-space:pre"> </span>x0, x1, xzr, lt</div><div class=""><span class="Apple-tab-span" style="white-space:pre">    </span>ret</div></div><div class=""><br class=""></div><div class="">Either way, the number of selects does not increase; if anything, it should decrease, because we no longer normalize towards the two-select sequence. 
The only thing I can think of that may lead to worse code is that the third sequence requires the cmp/ccmp to be scheduled close to each other so that the flags are not lost. This could increase register pressure, because the operand computations now have to happen before the pair, whereas the two-select version lets you schedule those computations between the two selects.</div><div class=""><br class=""></div><div class="">Anyway, I think this would need a more in-depth analysis to really understand what is going on in your benchmark.</div><div class=""><br class=""></div><div class="">- Matthias</div><br class=""><div><blockquote type="cite" class=""><div class="">On Jun 3, 2015, at 6:59 AM, James Molloy &lt;<a href="mailto:james@jamesmolloy.co.uk" class="">james@jamesmolloy.co.uk</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Matthias,<br class=""><br class="">This actually caused a 10% regression in one of our tests on Cortex-A57 (but a 4% improvement on Cortex-A53). I think this is to do with selects being expensive on heavily out of order architectures. Did you notice any regressions on Typhoon/Cyclone?<div class=""><br class=""></div><div class="">It might be useful to have a heuristic determining if this is beneficial or not - if conversion in CGP already has this "isPredictableSelectExpensive()" hook - perhaps a similar one might be useful here?</div><div class=""><br class=""></div><div class="">Cheers,</div><div class=""><br class=""></div><div class="">James</div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Mon, 1 Jun 2015 at 23:39 Phabricator &lt;<a href="mailto:reviews@reviews.llvm.org" class="">reviews@reviews.llvm.org</a>&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">REPOSITORY<br class="">

  rL LLVM<br class="">

<br class="">

<a href="http://reviews.llvm.org/D8232" target="_blank" class="">http://reviews.llvm.org/D8232</a><br class="">

<br class="">

Files:<br class="">

  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br class="">

  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br class="">

  llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td<br class="">

  llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td<br class="">

  llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll<br class="">

<br class="">

EMAIL PREFERENCES<br class="">

  <a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank" class="">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br class="">

_______________________________________________<br class="">

llvm-commits mailing list<br class="">

<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank" class="">llvm-commits@cs.uiuc.edu</a><br class="">

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank" class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br class="">

</blockquote></div>

</div></blockquote></div><br class=""></body></html>