[X86][ExeDepsFix] Add broadcast instruction to the table used by ExeDepsFix pass

Quentin Colombet qcolombet at apple.com
Tue Mar 25 17:17:14 PDT 2014


Thanks!

Committed revision 204770.

-Quentin

On Mar 25, 2014, at 4:37 PM, Nadav Rotem <nrotem at apple.com> wrote:

> LGTM.  It makes sense that TSVC improved.
> 
> 
> On Mar 25, 2014, at 4:34 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> 
>> Hi Nadav,
>> 
>> Here is a patch that adds the broadcast instructions to the ReplaceableInstrsAVX2 table.
>> That way, the ExeDepsFix pass can make better decisions when AVX2 broadcasts cross execution domains (int <-> float).
>> 
>> In particular, prior to this patch we were generating:
>> 	vpbroadcastd	LCPI1_0(%rip), %ymm2
>> 	vpand	%ymm2, %ymm0, %ymm0
>> 	vmaxps	%ymm1, %ymm0, %ymm0 ## <- domain change penalty
>> 
>> Now, we generate the following nice sequence where everything is in the float domain:
>> 	vbroadcastss	LCPI1_0(%rip), %ymm2
>> 	vandps	%ymm2, %ymm0, %ymm0
>> 	vmaxps	%ymm1, %ymm0, %ymm0
>> 
>> This patch gives a few speed-ups across the LLVM test-suite and SPEC (-O3 -march=core-avx2).
>> To avoid noise, I report here only the tests that show a difference in the disassembly and that run for more than 1s.
>> Benchmark_ID    	Reference	Test    	Speedup 	Percent
>> -------------------------------------------------------------------------------
>> ASCI_Purple/SMG2000/smg2000        	       1.5325	       1.5386	       1	    +0%
>> CINT2000/181.mcf/181.mcf           	       3.7871	       3.7725	       1	    +0%
>> CINT2006/401.bzip2/401.bzip2       	       1.5876	       1.5811	       1	    +0%
>> CINT2006/456.hmmer/456.hmmer       	       1.9414	        1.941	       1	    +0%
>> Misc/salsa20                       	       4.7333	       4.7319	       1	    +0%
>> PAQ8p/paq8p                        	       28.187	      28.2113	       1	    +0%
>> Polybench/linear-algebra/kernels/sy	      11.5787	      11.5834	       1	    +0%
>> Polybench/linear-algebra/kernels/sy	       2.9281	       2.9271	       1	    +0%
>> TSVC/ControlFlow-dbl/ControlFlow-db	        2.589	       2.5916	       1	    +0%
>> TSVC/ControlFlow-flt/ControlFlow-fl	       2.2556	       2.1948	    1.03	    +3%
>> TSVC/ControlLoops-flt/ControlLoops-	       1.6671	       1.6693	       1	    +0%
>> TSVC/Equivalencing-dbl/Equivalencin	       1.5151	       1.4524	    1.04	    +4%
>> TSVC/Expansion-dbl/Expansion-dbl   	        2.483	       2.4818	       1	    +0%
>> TSVC/IndirectAddressing-flt/Indirec	       2.2959	       2.2353	    1.03	    +3%
>> TSVC/InductionVariable-dbl/Inductio	       2.7742	       2.7731	       1	    +0%
>> TSVC/LinearDependence-dbl/LinearDep	       2.2146	       2.2224	       1	    +0%
>> TSVC/LinearDependence-flt/LinearDep	       1.5389	       1.5225	    1.01	    +1%
>> TSVC/NodeSplitting-dbl/NodeSplittin	       2.4763	       2.4753	       1	    +0%
>> TSVC/Packing-dbl/Packing-dbl       	        2.595	       2.5093	    1.03	    +3%
>> TSVC/Packing-flt/Packing-flt       	       2.2906	       2.2796	       1	    +0%
>> TSVC/Searching-dbl/Searching-dbl   	       2.9092	       2.8992	       1	    +0%
>> TSVC/Searching-flt/Searching-flt   	        2.893	       2.8579	    1.01	    +1%
>> TSVC/StatementReordering-dbl/Statem	       2.5175	       2.5113	       1	    +0%
>> nbench/nbench                      	       5.2255	       5.2213	       1	    +0%
>> sqlite3/sqlite3                    	        2.113	       2.1165	       1	    +0%
>> -------------------------------------------------------------------------------
>> Min (25)                           	            -	            -	       1	      -
>> -------------------------------------------------------------------------------
>> Max (25)                           	            -	            -	    1.04	      -
>> -------------------------------------------------------------------------------
>> Sum (25)                           	           99	           98	       1	    +0%
>> -------------------------------------------------------------------------------
>> A.Mean (25)                        	            -	            -	    1.01	    +1%
>> -------------------------------------------------------------------------------
>> G.Mean 2 (25)                      	            -	            -	    1.01	    +1%
>> -------------------------------------------------------------------------------
>> 
>> Thanks,
>> -Quentin
>> <exefix-broadcast.patch>
> 
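As a rough illustration of the idea in the quoted message, a replaceable-instruction table can be read as rows of equivalent instructions, one column per execution domain, so that a pass can swap an instruction for its counterpart in the domain its neighbours use. The sketch below is not the LLVM source: the table name kReplaceable, the helper equivalentIn, and the use of mnemonic strings instead of opcode enumerators are all assumptions made for the example, and the packed-double column is left empty because the message does not show those forms.

```cpp
#include <cstring>

// Execution domains, used as column indices in the table below.
enum Domain { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Hypothetical miniature table: each row lists equivalent mnemonics,
// one per domain. Only mnemonics quoted in the message are used;
// nullptr marks a column this sketch does not fill in.
static const char *const kReplaceable[][3] = {
    // PackedSingle   PackedDouble  PackedInt
    {"vbroadcastss",  nullptr,      "vpbroadcastd"},
    {"vandps",        nullptr,      "vpand"},
};

// Return the equivalent of `op` in domain `want`, or nullptr when the
// instruction is not in the table or has no entry for that domain.
const char *equivalentIn(const char *op, Domain want) {
  for (const auto &row : kReplaceable)
    for (const char *candidate : row)
      if (candidate && std::strcmp(candidate, op) == 0)
        return row[want];
  return nullptr;
}
```

With this table, equivalentIn("vpbroadcastd", PackedSingle) yields "vbroadcastss" and equivalentIn("vpand", PackedSingle) yields "vandps", which mirrors the before/after sequences in the message. An instruction with no row, such as vmaxps here, yields nullptr and so stays in its own domain.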

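The Speedup column in the table appears to be Reference divided by Test (for example, 2.2556 / 2.1948 is about 1.03 for TSVC/ControlFlow-flt). That reading is inferred from the numbers and is not stated in the message. Under that assumption, a minimal sketch of the per-test speedup and a geometric mean over speedups could look like this; the function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Speedup of the patched build over the reference build, assuming both
// values are run times (lower is better), so values above 1 are wins.
double speedup(double reference, double test) { return reference / test; }

// Geometric mean of n positive values, computed through logarithms to
// avoid overflow or underflow in the running product.
double geometricMean(const double *values, std::size_t n) {
  double logSum = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    logSum += std::log(values[i]);
  return std::exp(logSum / static_cast<double>(n));
}
```

The geometric mean is the usual choice for summarizing ratios such as speedups, because it treats a 2x gain and a 2x loss symmetrically.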