[X86][ExeDepsFix] Add broadcast instruction to the table used by ExeDepsFix pass

Nadav Rotem nrotem at apple.com
Tue Mar 25 16:37:45 PDT 2014


LGTM.  It makes sense that TSVC improved.


On Mar 25, 2014, at 4:34 PM, Quentin Colombet <qcolombet at apple.com> wrote:

> Hi Nadav,
> 
> Here is a patch that adds the broadcast instructions to the ReplaceableInstrsAVX2 table.
> That way the ExeDepsFix pass can make better decisions when AVX2 broadcasts cross domains (int <-> float).
> 
> In particular, prior to this patch we were generating:
> 	vpbroadcastd	LCPI1_0(%rip), %ymm2
> 	vpand	%ymm2, %ymm0, %ymm0
> 	vmaxps	%ymm1, %ymm0, %ymm0 ## <- domain change penalty
> 
> Now, we generate the following nice sequence where everything is in the float domain:
> 	vbroadcastss	LCPI1_0(%rip), %ymm2
> 	vandps	%ymm2, %ymm0, %ymm0
> 	vmaxps	%ymm1, %ymm0, %ymm0
> 
> This patch gives a few speed-ups across the LLVM test-suite + SPEC (-O3 -march=core-avx2).
> To avoid noise, I have reported here only the tests that show a difference in the disassembly and that run for more than 1s.
> Benchmark_ID    	Reference	Test    	Speedup 	Percent
> -------------------------------------------------------------------------------
> ASCI_Purple/SMG2000/smg2000        	       1.5325	       1.5386	       1	    +0%
> CINT2000/181.mcf/181.mcf           	       3.7871	       3.7725	       1	    +0%
> CINT2006/401.bzip2/401.bzip2       	       1.5876	       1.5811	       1	    +0%
> CINT2006/456.hmmer/456.hmmer       	       1.9414	        1.941	       1	    +0%
> Misc/salsa20                       	       4.7333	       4.7319	       1	    +0%
> PAQ8p/paq8p                        	       28.187	      28.2113	       1	    +0%
> Polybench/linear-algebra/kernels/sy	      11.5787	      11.5834	       1	    +0%
> Polybench/linear-algebra/kernels/sy	       2.9281	       2.9271	       1	    +0%
> TSVC/ControlFlow-dbl/ControlFlow-db	        2.589	       2.5916	       1	    +0%
> TSVC/ControlFlow-flt/ControlFlow-fl	       2.2556	       2.1948	    1.03	    +3%
> TSVC/ControlLoops-flt/ControlLoops-	       1.6671	       1.6693	       1	    +0%
> TSVC/Equivalencing-dbl/Equivalencin	       1.5151	       1.4524	    1.04	    +4%
> TSVC/Expansion-dbl/Expansion-dbl   	        2.483	       2.4818	       1	    +0%
> TSVC/IndirectAddressing-flt/Indirec	       2.2959	       2.2353	    1.03	    +3%
> TSVC/InductionVariable-dbl/Inductio	       2.7742	       2.7731	       1	    +0%
> TSVC/LinearDependence-dbl/LinearDep	       2.2146	       2.2224	       1	    +0%
> TSVC/LinearDependence-flt/LinearDep	       1.5389	       1.5225	    1.01	    +1%
> TSVC/NodeSplitting-dbl/NodeSplittin	       2.4763	       2.4753	       1	    +0%
> TSVC/Packing-dbl/Packing-dbl       	        2.595	       2.5093	    1.03	    +3%
> TSVC/Packing-flt/Packing-flt       	       2.2906	       2.2796	       1	    +0%
> TSVC/Searching-dbl/Searching-dbl   	       2.9092	       2.8992	       1	    +0%
> TSVC/Searching-flt/Searching-flt   	        2.893	       2.8579	    1.01	    +1%
> TSVC/StatementReordering-dbl/Statem	       2.5175	       2.5113	       1	    +0%
> nbench/nbench                      	       5.2255	       5.2213	       1	    +0%
> sqlite3/sqlite3                    	        2.113	       2.1165	       1	    +0%
> -------------------------------------------------------------------------------
> Min (25)                           	            -	            -	       1	      -
> -------------------------------------------------------------------------------
> Max (25)                           	            -	            -	    1.04	      -
> -------------------------------------------------------------------------------
> Sum (25)                           	           99	           98	       1	    +0%
> -------------------------------------------------------------------------------
> A.Mean (25)                        	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
> G.Mean 2 (25)                      	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
> 
> Thanks,
> -Quentin
> <exefix-broadcast.patch>
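
For readers following along, the idea behind the table in the quoted patch can be sketched roughly as below. This is only an illustration under assumptions, not the actual LLVM code: the names Domain, kReplaceable and inDomain are hypothetical, mnemonics stand in for opcode enumerators, and the only rows shown are the two instruction pairs that appear in the assembly above (plus vandpd, the double-precision form of the same AND). The repeated vbroadcastss in the double column is an assumption of this sketch.

```cpp
#include <cassert>
#include <string_view>

// Execution domains, in the column order used by the table below.
// (Hypothetical names for this sketch.)
enum Domain { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Illustrative equivalence table: each row lists one operation in its
// single-precision, double-precision and integer spelling. A real table
// would hold opcode enumerators rather than mnemonic strings.
constexpr std::string_view kReplaceable[][3] = {
    // PackedSingle    PackedDouble     PackedInt
    {"vandps",         "vandpd",        "vpand"},
    // Assumption: no separate double form, so the float form is repeated.
    {"vbroadcastss",   "vbroadcastss",  "vpbroadcastd"},
};

// Returns the spelling of `mnemonic` in the `target` domain, or the
// mnemonic unchanged when no row of the table lists it.
inline std::string_view inDomain(std::string_view mnemonic, Domain target) {
    assert(target >= PackedSingle && target <= PackedInt);
    for (const auto &row : kReplaceable)
        for (std::string_view candidate : row)
            if (candidate == mnemonic)
                return row[target];
    return mnemonic;
}
```

With rows like these, a pass that sees the result feeding vmaxps can ask for the PackedSingle form of vpbroadcastd and vpand. An instruction missing from the table cannot be moved, which is why the broadcast had to be listed before the whole sequence could stay in the float domain.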
