<div dir="ltr">Hi Matthias,<br><br>Sorry, but I've just bisected a failure in 403.gcc from SPEC2006 to this commit too.<div><br></div><div>It's a fairly large test, so getting a reduced testcase for you might be difficult. Do you have access to SPEC2006?</div><div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 3 Jun 2015 at 19:27 James Molloy <<a href="mailto:james@jamesmolloy.co.uk">james@jamesmolloy.co.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Matthias,<br><br><div>Thanks for the clarification - I'll take a further look.</div><div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 3 Jun 2015 at 17:21 Matthias Braun <<a href="mailto:matze@braunis.de" target="_blank">matze@braunis.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>I reported my measurements in one of the early comments of the review thread: I only had slight improvements between 0.3 and 1.7% and two 0.5% regressions.</div><div><br></div><div>However this commit does not introduce more select instructions. The idea behind it is that for this code:</div><div><div><br></div><div>long foo(long a, long b, long v1, long v2) {</div><div>  if (a >= v1 && a < v2)</div><div>    return b;</div><div>  return 0;</div><div>}</div></div><div><br></div><div>we used to generate:</div><div><br></div><div><div><span style="white-space:pre-wrap">        </span>cmp<span style="white-space:pre-wrap">     </span> x0, x2</div><div><span style="white-space:pre-wrap">  </span>cset<span style="white-space:pre-wrap">    </span> w8, ge</div><div><span style="white-space:pre-wrap">  </span>cmp<span style="white-space:pre-wrap">     </span> x0, x3</div><div><span style="white-space:pre-wrap">  </span>cset<span style="white-space:pre-wrap">    </span> w9, lt</div><div><span style="white-space:pre-wrap">  </span>tst<span style="white-space:pre-wrap">     </span> w8, w9</div><div><span style="white-space:pre-wrap">  </span>csel<span style="white-space:pre-wrap">    </span>x0, x1, xzr, ne</div><div><span style="white-space:pre-wrap">  </span>ret</div></div><div><br></div><div>in another commit I changed the generic backend to use 2 selects instead:</div><div><br></div><div><div><span style="white-space:pre-wrap">     </span>cmp<span style="white-space:pre-wrap">     </span> x0, x2</div><div><span style="white-space:pre-wrap">  </span>csel<span style="white-space:pre-wrap">    </span>x8, x1, xzr, ge</div><div><span style="white-space:pre-wrap">  </span>cmp<span style="white-space:pre-wrap">     </span> x0, x3</div><div><span style="white-space:pre-wrap">  </span>csel<span style="white-space:pre-wrap">    </span>x0, x8, xzr, lt</div><div><span style="white-space:pre-wrap">  </span>ret</div></div><div><br></div><div>in this commit I turned of the two select variant (by overriding shouldNormalizeToSelect()) so the improved cmp/ccmp matching can generate:</div><div><br></div><div><span style="white-space:pre-wrap">      </span>cmp<span style="white-space:pre-wrap">     </span> x0, x2</div><div><div><span style="white-space:pre-wrap">       </span>ccmp<span style="white-space:pre-wrap">    </span>x0, x3, #0, ge</div><div><span style="white-space:pre-wrap">   </span>csel<span style="white-space:pre-wrap">    </span>x0, x1, xzr, lt</div><div><span style="white-space:pre-wrap">  </span>ret</div></div><div><br></div><div>In any way the number of select does not increase, if anything it should decrease because we are not normalizing towards the two select sequence anymore. The only thing I can think of that may lead to worse code is that the third sequence requires the cmp/ccmp to be scheduled pretty close to each other to not loose the flags, this could lead to increased register pressure as operand computations now happen before that, while for the 2 select version you could schedule the computations in between the two selects.</div><div><br></div><div>Anyway I think this would need a more in-depth analysis to really understand what is going on in your benchmark.</div></div><div style="word-wrap:break-word"><div><br></div><div>- Matthias</div></div><div style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Jun 3, 2015, at 6:59 AM, James Molloy <<a href="mailto:james@jamesmolloy.co.uk" target="_blank">james@jamesmolloy.co.uk</a>> wrote:</div><br><div><div dir="ltr">Hi Matthias,<br><br>This actually caused a 10% regression in one of our tests on Cortex-A57 (but a 4% improvement on Cortex-A53). I think this is to do with selects being expensive on heavily out of order architectures. Did you notice any regressions on Typhoon/Cyclone?<div><br></div><div>It might be useful to have a heuristic determining if this is beneficial or not - if conversion in CGP already has this "isPredictableSelectExpensive()" hook - perhaps a similar one might be useful here?</div><div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, 1 Jun 2015 at 23:39 Phabricator <<a href="mailto:reviews@reviews.llvm.org" target="_blank">reviews@reviews.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">REPOSITORY<br>

  rL LLVM<br>

<br>

<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D8232&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=Gdqbblg081QXZ4Qg7OhixsHvneH_kDHGwUBZ8ZKdOTc&s=vje1lx_YnX_sc-CYJcXvlF3-Yt-dC4mR2IaDt_3uA4w&e=" target="_blank">http://reviews.llvm.org/D8232</a><br>

<br>

Files:<br>

  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br>

  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br>

  llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td<br>

  llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td<br>

  llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll<br>

<br>

EMAIL PREFERENCES<br>

  <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_settings_panel_emailpreferences_&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=Gdqbblg081QXZ4Qg7OhixsHvneH_kDHGwUBZ8ZKdOTc&s=OtEecvfwv8xDM2A9oXbehLujvOjjYcttTI8WpYPmLzE&e=" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>

_______________________________________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

</blockquote></div>

_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br></div></blockquote></div><br></div></blockquote></div></blockquote></div>