<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div>LGTM.  It makes sense that TSVC improved.</div><div><br></div><br><div><div>On Mar 25, 2014, at 4:34 PM, Quentin Colombet &lt;<a href="mailto:qcolombet@apple.com">qcolombet@apple.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><meta http-equiv="Content-Type" content="text/html charset=us-ascii"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi Nadav,<div><br></div><div>Here is a patch that adds the different broadcast instructions to the <span style="font-family: Menlo; font-size: 11px;">ReplaceableInstrsAVX2 </span>table.</div><div>That way the ExeDepsFix pass can make better decisions when AVX2 broadcasts cross domains (int &lt;-&gt; float).</div><div><br></div><div>In particular, prior to this patch we were generating:</div><div><div><span class="Apple-tab-span" style="white-space:pre">        </span>vpbroadcastd<span class="Apple-tab-span" style="white-space:pre">        </span>LCPI1_0(%rip), %ymm2</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>vpand<span class="Apple-tab-span" style="white-space:pre">       </span>%ymm2, %ymm0, %ymm0</div><div><span class="Apple-tab-span" style="white-space:pre">  </span>vmaxps<span class="Apple-tab-span" style="white-space:pre">      </span>%ymm1, %ymm0, %ymm0 ## &lt;- domain change penalty</div></div><div><br></div><div>Now, we generate the following nice sequence where everything is in the float domain:</div><div><div><span class="Apple-tab-span" style="white-space:pre">       </span>vbroadcastss<span class="Apple-tab-span" style="white-space:pre">        </span>LCPI1_0(%rip), %ymm2</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>vandps<span class="Apple-tab-span" style="white-space:pre">      </span>%ymm2, 
%ymm0, %ymm0</div><div><span class="Apple-tab-span" style="white-space:pre">  </span>vmaxps<span class="Apple-tab-span" style="white-space:pre">      </span>%ymm1, %ymm0, %ymm0</div></div><div><br></div><div>This patch gives a few speed-ups across the llvm test-suite + spec (-O3 -march=core-avx2).</div><div>I have reported here only the tests that show a difference in the disassembly and that run for more than 1s to avoid noise.</div><div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Benchmark_ID    <span class="Apple-tab-span" style="white-space:pre"> </span>Reference<span class="Apple-tab-span" style="white-space:pre">   </span>Test    <span class="Apple-tab-span" style="white-space:pre">  </span>Speedup <span class="Apple-tab-span" style="white-space:pre">    </span>Percent</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">ASCI_Purple/SMG2000/smg2000        <span class="Apple-tab-span" style="white-space:pre"> </span>       1.5325<span class="Apple-tab-span" style="white-space:pre">        </span>       1.5386<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">CINT2000/181.mcf/181.mcf           <span class="Apple-tab-span" style="white-space:pre">       </span>       3.7871<span class="Apple-tab-span" style="white-space:pre">        </span>       3.7725<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">CINT2006/401.bzip2/401.bzip2       <span class="Apple-tab-span" style="white-space:pre"> </span>       1.5876<span class="Apple-tab-span" 
style="white-space:pre">        </span>       1.5811<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">CINT2006/456.hmmer/456.hmmer       <span class="Apple-tab-span" style="white-space:pre"> </span>       1.9414<span class="Apple-tab-span" style="white-space:pre">        </span>        1.941<span class="Apple-tab-span" style="white-space:pre">   </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Misc/salsa20                       <span class="Apple-tab-span" style="white-space:pre"> </span>       4.7333<span class="Apple-tab-span" style="white-space:pre">        </span>       4.7319<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">PAQ8p/paq8p                        <span class="Apple-tab-span" style="white-space:pre"> </span>       28.187<span class="Apple-tab-span" style="white-space:pre">        </span>      28.2113<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Polybench/linear-algebra/kernels/sy<span class="Apple-tab-span" style="white-space:pre">     </span>      11.5787<span class="Apple-tab-span" style="white-space:pre">        </span>      11.5834<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Polybench/linear-algebra/kernels/sy<span class="Apple-tab-span" style="white-space:pre">     </span>    
   2.9281<span class="Apple-tab-span" style="white-space:pre">        </span>       2.9271<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/ControlFlow-dbl/ControlFlow-db<span class="Apple-tab-span" style="white-space:pre">     </span>        2.589<span class="Apple-tab-span" style="white-space:pre">   </span>       2.5916<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/ControlFlow-flt/ControlFlow-fl<span class="Apple-tab-span" style="white-space:pre">     </span>       2.2556<span class="Apple-tab-span" style="white-space:pre">        </span>       2.1948<span class="Apple-tab-span" style="white-space:pre">        </span>    1.03<span class="Apple-tab-span" style="white-space:pre">  </span>    +3%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/ControlLoops-flt/ControlLoops-<span class="Apple-tab-span" style="white-space:pre">     </span>       1.6671<span class="Apple-tab-span" style="white-space:pre">        </span>       1.6693<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Equivalencing-dbl/Equivalencin<span class="Apple-tab-span" style="white-space:pre">     </span>       1.5151<span class="Apple-tab-span" style="white-space:pre">        </span>       1.4524<span class="Apple-tab-span" style="white-space:pre">        </span>    1.04<span class="Apple-tab-span" style="white-space:pre">  </span>    +4%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Expansion-dbl/Expansion-dbl   <span 
class="Apple-tab-span" style="white-space:pre">        </span>        2.483<span class="Apple-tab-span" style="white-space:pre">   </span>       2.4818<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/IndirectAddressing-flt/Indirec<span class="Apple-tab-span" style="white-space:pre">     </span>       2.2959<span class="Apple-tab-span" style="white-space:pre">        </span>       2.2353<span class="Apple-tab-span" style="white-space:pre">        </span>    1.03<span class="Apple-tab-span" style="white-space:pre">  </span>    +3%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/InductionVariable-dbl/Inductio<span class="Apple-tab-span" style="white-space:pre">     </span>       2.7742<span class="Apple-tab-span" style="white-space:pre">        </span>       2.7731<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/LinearDependence-dbl/LinearDep<span class="Apple-tab-span" style="white-space:pre">     </span>       2.2146<span class="Apple-tab-span" style="white-space:pre">        </span>       2.2224<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/LinearDependence-flt/LinearDep<span class="Apple-tab-span" style="white-space:pre">     </span>       1.5389<span class="Apple-tab-span" style="white-space:pre">        </span>       1.5225<span class="Apple-tab-span" style="white-space:pre">        </span>    1.01<span class="Apple-tab-span" style="white-space:pre">  </span>    +1%</div><div style="margin: 0px; font-size: 11px; font-family: 
Menlo;">TSVC/NodeSplitting-dbl/NodeSplittin<span class="Apple-tab-span" style="white-space:pre">     </span>       2.4763<span class="Apple-tab-span" style="white-space:pre">        </span>       2.4753<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Packing-dbl/Packing-dbl       <span class="Apple-tab-span" style="white-space:pre">      </span>        2.595<span class="Apple-tab-span" style="white-space:pre">   </span>       2.5093<span class="Apple-tab-span" style="white-space:pre">        </span>    1.03<span class="Apple-tab-span" style="white-space:pre">  </span>    +3%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Packing-flt/Packing-flt       <span class="Apple-tab-span" style="white-space:pre"> </span>       2.2906<span class="Apple-tab-span" style="white-space:pre">        </span>       2.2796<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Searching-dbl/Searching-dbl   <span class="Apple-tab-span" style="white-space:pre">   </span>       2.9092<span class="Apple-tab-span" style="white-space:pre">        </span>       2.8992<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/Searching-flt/Searching-flt   <span class="Apple-tab-span" style="white-space:pre">        </span>        2.893<span class="Apple-tab-span" style="white-space:pre">   </span>       2.8579<span class="Apple-tab-span" style="white-space:pre">        </span>    1.01<span class="Apple-tab-span" style="white-space:pre">  </span>    +1%</div><div 
style="margin: 0px; font-size: 11px; font-family: Menlo;">TSVC/StatementReordering-dbl/Statem<span class="Apple-tab-span" style="white-space:pre">     </span>       2.5175<span class="Apple-tab-span" style="white-space:pre">        </span>       2.5113<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">nbench/nbench                      <span class="Apple-tab-span" style="white-space:pre">      </span>       5.2255<span class="Apple-tab-span" style="white-space:pre">        </span>       5.2213<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">sqlite3/sqlite3                    <span class="Apple-tab-span" style="white-space:pre">   </span>        2.113<span class="Apple-tab-span" style="white-space:pre">   </span>       2.1165<span class="Apple-tab-span" style="white-space:pre">        </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Min (25)                           <span class="Apple-tab-span" style="white-space:pre">  </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>      -</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Max (25)                 
          <span class="Apple-tab-span" style="white-space:pre">     </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>    1.04<span class="Apple-tab-span" style="white-space:pre">  </span>      -</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">Sum (25)                           <span class="Apple-tab-span" style="white-space:pre">        </span>           99<span class="Apple-tab-span" style="white-space:pre">      </span>           98<span class="Apple-tab-span" style="white-space:pre">      </span>       1<span class="Apple-tab-span" style="white-space:pre">     </span>    +0%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">A.Mean (25)                        <span class="Apple-tab-span" style="white-space:pre">       </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>    1.01<span class="Apple-tab-span" style="white-space:pre">  </span>    +1%</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">-------------------------------------------------------------------------------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;">G.Mean 2 (25)                      <span class="Apple-tab-span" style="white-space:pre">    </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>            -<span class="Apple-tab-span" style="white-space:pre"> </span>    1.01<span class="Apple-tab-span" style="white-space:pre">  </span>    +1%</div><div style="margin: 0px; font-size: 11px; 
font-family: Menlo;">-------------------------------------------------------------------------------</div></div><div><br></div><div>Thanks,</div><div><div apple-content-edited="true">
<div style="font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">-Quentin</div>

</div>
</div></div><span>&lt;exefix-broadcast.patch&gt;</span><meta http-equiv="Content-Type" content="text/html charset=us-ascii"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div></div></div></blockquote></div><br></body></html>