[X86][ExeDepsFix] Add broadcast instruction to the table used by ExeDepsFix pass
Quentin Colombet
qcolombet at apple.com
Tue Mar 25 17:17:14 PDT 2014
Thanks!
Committed revision 204770.
-Quentin
On Mar 25, 2014, at 4:37 PM, Nadav Rotem <nrotem at apple.com> wrote:
> LGTM. It makes sense that TSVC improved.
>
>
> On Mar 25, 2014, at 4:34 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
>> Hi Nadav,
>>
>> Here is a patch that adds the AVX2 broadcast instructions to the ReplaceableInstrsAVX2 table.
>> That way the ExeDepsFix pass can make better decisions when an AVX2 broadcast crosses execution domains (int <-> float).
>>
>> In particular, prior to this patch we were generating:
>> vpbroadcastd LCPI1_0(%rip), %ymm2
>> vpand %ymm2, %ymm0, %ymm0
>> vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty
>>
>> Now, we generate the following nice sequence where everything is in the float domain:
>> vbroadcastss LCPI1_0(%rip), %ymm2
>> vandps %ymm2, %ymm0, %ymm0
>> vmaxps %ymm1, %ymm0, %ymm0
>>
>> This patch gives a few speed-ups across the LLVM test-suite and SPEC (-O3 -march=core-avx2).
>> I have reported here only the tests that show a difference in the disassembly and that run for more than 1s to avoid noise.
>> Benchmark_ID                         Reference  Test     Speedup  Percent
>> -------------------------------------------------------------------------
>> ASCI_Purple/SMG2000/smg2000          1.5325     1.5386   1        +0%
>> CINT2000/181.mcf/181.mcf             3.7871     3.7725   1        +0%
>> CINT2006/401.bzip2/401.bzip2         1.5876     1.5811   1        +0%
>> CINT2006/456.hmmer/456.hmmer         1.9414     1.941    1        +0%
>> Misc/salsa20                         4.7333     4.7319   1        +0%
>> PAQ8p/paq8p                          28.187     28.2113  1        +0%
>> Polybench/linear-algebra/kernels/sy  11.5787    11.5834  1        +0%
>> Polybench/linear-algebra/kernels/sy  2.9281     2.9271   1        +0%
>> TSVC/ControlFlow-dbl/ControlFlow-db  2.589      2.5916   1        +0%
>> TSVC/ControlFlow-flt/ControlFlow-fl  2.2556     2.1948   1.03     +3%
>> TSVC/ControlLoops-flt/ControlLoops-  1.6671     1.6693   1        +0%
>> TSVC/Equivalencing-dbl/Equivalencin  1.5151     1.4524   1.04     +4%
>> TSVC/Expansion-dbl/Expansion-dbl     2.483      2.4818   1        +0%
>> TSVC/IndirectAddressing-flt/Indirec  2.2959     2.2353   1.03     +3%
>> TSVC/InductionVariable-dbl/Inductio  2.7742     2.7731   1        +0%
>> TSVC/LinearDependence-dbl/LinearDep  2.2146     2.2224   1        +0%
>> TSVC/LinearDependence-flt/LinearDep  1.5389     1.5225   1.01     +1%
>> TSVC/NodeSplitting-dbl/NodeSplittin  2.4763     2.4753   1        +0%
>> TSVC/Packing-dbl/Packing-dbl         2.595      2.5093   1.03     +3%
>> TSVC/Packing-flt/Packing-flt         2.2906     2.2796   1        +0%
>> TSVC/Searching-dbl/Searching-dbl     2.9092     2.8992   1        +0%
>> TSVC/Searching-flt/Searching-flt     2.893      2.8579   1.01     +1%
>> TSVC/StatementReordering-dbl/Statem  2.5175     2.5113   1        +0%
>> nbench/nbench                        5.2255     5.2213   1        +0%
>> sqlite3/sqlite3                      2.113      2.1165   1        +0%
>> -------------------------------------------------------------------------
>> Min (25)                             -          -        1        -
>> Max (25)                             -          -        1.04     -
>> Sum (25)                             99         98       1        +0%
>> A.Mean (25)                          -          -        1.01     +1%
>> G.Mean 2 (25)                        -          -        1.01     +1%
>> -------------------------------------------------------------------------
>>
>> Thanks,
>> -Quentin
>> <exefix-broadcast.patch>
>
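
For readers unfamiliar with the mechanism discussed in the quoted message, here is a minimal sketch of a table-driven domain swap. It is not the code from the patch: the rows, the use of mnemonic strings instead of opcode enums, and the names kReplaceable and inDomain are all assumptions made for illustration. The idea is only that each row lists instructions that produce the same bits in different execution domains, so a pass can pick the variant that matches neighboring instructions.

```cpp
#include <string>

// Execution domains, in the column order assumed by this sketch.
enum Domain { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Hypothetical rows: each row holds equivalent mnemonics, one per domain.
// These are illustrative and are not copied from the real LLVM table.
static const char *const kReplaceable[][3] = {
    {"vandps", "vandpd", "vpand"},
    {"vbroadcastss", "vbroadcastss", "vpbroadcastd"},
};

// Return the equivalent of `mnemonic` in domain `target`.
// If no row lists the mnemonic, it is returned unchanged.
inline std::string inDomain(const std::string &mnemonic, Domain target) {
  for (const auto &row : kReplaceable)
    for (const char *candidate : row)
      if (mnemonic == candidate)
        return row[target];
  return mnemonic;
}
```

With these assumed rows, inDomain("vpbroadcastd", PackedSingle) yields "vbroadcastss" and inDomain("vpand", PackedSingle) yields "vandps", which mirrors the before/after sequences in the quoted message. An instruction absent from the table, such as vmaxps here, is left alone.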