[X86][ExeDepsFix] Add broadcast instruction to the table used by ExeDepsFix pass
Nadav Rotem
nrotem at apple.com
Tue Mar 25 16:37:45 PDT 2014
LGTM. It makes sense that TSVC improved.
On Mar 25, 2014, at 4:34 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Nadav,
>
> Here is a patch that adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
> That way the ExeDepsFix pass can make better decisions when AVX2 broadcasts cross execution domains (int <-> float).
>
> In particular, prior to this patch we were generating:
> vpbroadcastd LCPI1_0(%rip), %ymm2
> vpand %ymm2, %ymm0, %ymm0
> vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty
>
> Now, we generate the following nice sequence where everything is in the float domain:
> vbroadcastss LCPI1_0(%rip), %ymm2
> vandps %ymm2, %ymm0, %ymm0
> vmaxps %ymm1, %ymm0, %ymm0
>
> This patch gives a few speed-ups across the LLVM test-suite and SPEC (-O3 -march=core-avx2).
> To avoid noise, I have reported here only the tests that show a difference in the disassembly and that run for more than 1s.
> Benchmark_ID                         Reference  Test      Speedup  Percent
> -------------------------------------------------------------------------------
> ASCI_Purple/SMG2000/smg2000          1.5325     1.5386    1        +0%
> CINT2000/181.mcf/181.mcf             3.7871     3.7725    1        +0%
> CINT2006/401.bzip2/401.bzip2         1.5876     1.5811    1        +0%
> CINT2006/456.hmmer/456.hmmer         1.9414     1.941     1        +0%
> Misc/salsa20                         4.7333     4.7319    1        +0%
> PAQ8p/paq8p                          28.187     28.2113   1        +0%
> Polybench/linear-algebra/kernels/sy  11.5787    11.5834   1        +0%
> Polybench/linear-algebra/kernels/sy  2.9281     2.9271    1        +0%
> TSVC/ControlFlow-dbl/ControlFlow-db  2.589      2.5916    1        +0%
> TSVC/ControlFlow-flt/ControlFlow-fl  2.2556     2.1948    1.03     +3%
> TSVC/ControlLoops-flt/ControlLoops-  1.6671     1.6693    1        +0%
> TSVC/Equivalencing-dbl/Equivalencin  1.5151     1.4524    1.04     +4%
> TSVC/Expansion-dbl/Expansion-dbl     2.483      2.4818    1        +0%
> TSVC/IndirectAddressing-flt/Indirec  2.2959     2.2353    1.03     +3%
> TSVC/InductionVariable-dbl/Inductio  2.7742     2.7731    1        +0%
> TSVC/LinearDependence-dbl/LinearDep  2.2146     2.2224    1        +0%
> TSVC/LinearDependence-flt/LinearDep  1.5389     1.5225    1.01     +1%
> TSVC/NodeSplitting-dbl/NodeSplittin  2.4763     2.4753    1        +0%
> TSVC/Packing-dbl/Packing-dbl         2.595      2.5093    1.03     +3%
> TSVC/Packing-flt/Packing-flt         2.2906     2.2796    1        +0%
> TSVC/Searching-dbl/Searching-dbl     2.9092     2.8992    1        +0%
> TSVC/Searching-flt/Searching-flt     2.893      2.8579    1.01     +1%
> TSVC/StatementReordering-dbl/Statem  2.5175     2.5113    1        +0%
> nbench/nbench                        5.2255     5.2213    1        +0%
> sqlite3/sqlite3                      2.113      2.1165    1        +0%
> -------------------------------------------------------------------------------
> Min (25)                             -          -         1        -
> Max (25)                             -          -         1.04     -
> Sum (25)                             99         98        1        +0%
> A.Mean (25)                          -          -         1.01     +1%
> G.Mean 2 (25)                        -          -         1.01     +1%
> -------------------------------------------------------------------------------
>
> Thanks,
> -Quentin
> <exefix-broadcast.patch>
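
For readers unfamiliar with the table mentioned in the quoted message, here is a minimal stand-alone C++ sketch of how a replaceable-instruction table can let a pass swap an opcode for its equivalent in another execution domain. The enum names, the table rows, and the helper equivalentIn are illustrative assumptions made for this sketch; they are not the contents of the attached patch or of the actual X86 backend sources.

```cpp
#include <cassert>
#include <cstdint>

// Execution domains, used as column indices into the table below.
enum Domain : unsigned { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Hypothetical opcode names, chosen only to mirror the assembly in the mail.
enum Opcode : std::uint16_t {
  VANDPSYrr,
  VANDPDYrr,
  VPANDYrr,
  VBROADCASTSSYrm,
  VPBROADCASTDYrm,
  VMAXPSYrr, // deliberately absent from the table: float-only in this sketch
};

// Each row lists opcodes assumed to compute the same result, one per domain
// (columns: PackedSingle, PackedDouble, PackedInt).
static const Opcode kReplaceable[][3] = {
    {VANDPSYrr, VANDPDYrr, VPANDYrr},
    // Assumed row for a 32-bit broadcast: the float form is reused for the
    // double column, and the integer form fills the integer column.
    {VBROADCASTSSYrm, VBROADCASTSSYrm, VPBROADCASTDYrm},
};

// Returns the opcode equivalent to `op` in domain `d`, or `op` unchanged
// when no table row mentions it (the instruction is not replaceable).
Opcode equivalentIn(Opcode op, Domain d) {
  assert(d <= PackedInt && "domain index out of range");
  for (const auto &row : kReplaceable)
    for (Opcode candidate : row)
      if (candidate == op)
        return row[d];
  return op;
}
```

Under these assumptions, equivalentIn(VPBROADCASTDYrm, PackedSingle) yields VBROADCASTSSYrm and equivalentIn(VPANDYrr, PackedSingle) yields VANDPSYrr, which corresponds to the all-float sequence shown in the quoted message. An opcode with no row, such as VMAXPSYrr here, is returned unchanged.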