[X86][ExeDepsFix] Add broadcast instruction to the table used by ExeDepsFix pass

Quentin Colombet qcolombet at apple.com
Tue Mar 25 16:34:38 PDT 2014


Hi Nadav,

Here is a patch that adds the broadcast instructions to the ReplaceableInstrsAVX2 table.
That way, the ExeDepsFix pass can make better decisions when AVX2 broadcasts cross domains (int <-> float).

In particular, prior to this patch we were generating:
	vpbroadcastd	LCPI1_0(%rip), %ymm2
	vpand	%ymm2, %ymm0, %ymm0
	vmaxps	%ymm1, %ymm0, %ymm0 ## <- domain change penalty

Now, we generate the following nice sequence where everything is in the float domain:
	vbroadcastss	LCPI1_0(%rip), %ymm2
	vandps	%ymm2, %ymm0, %ymm0
	vmaxps	%ymm1, %ymm0, %ymm0
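
To illustrate the idea, here is a simplified, standalone model of such a replacement table. It is only a sketch and not the code from the patch: the opcode enum, the rows, and the replacementFor helper are hypothetical names made up for this example. Each row lists equivalent instructions, one per execution domain, so a pass can swap an instruction for its counterpart in the domain its neighbours use.

```cpp
#include <cstdint>

// Hypothetical opcode names, for illustration only; they merely mimic
// the naming style of the X86 backend.
enum Opcode : uint16_t {
  VBROADCASTSSYrm,
  VPBROADCASTDYrm,
  VANDPSYrr,
  VANDPDYrr,
  VPANDYrr,
  VMAXPSYrr
};

// Column index of each execution domain in a table row.
enum Domain { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Each row holds equivalent instructions: { single, double, int }.
// Assumption of this sketch: a 32-bit broadcast has no distinct
// double-domain form, so that column reuses the single-precision opcode.
static const uint16_t Replaceable[][3] = {
  { VANDPSYrr,       VANDPDYrr,       VPANDYrr        },
  { VBROADCASTSSYrm, VBROADCASTSSYrm, VPBROADCASTDYrm },
};

// Returns the equivalent of `opcode` in the `target` domain, or -1 if
// the opcode is not in the table (i.e. it cannot change domain).
int replacementFor(uint16_t opcode, Domain target) {
  for (const auto &row : Replaceable)
    for (uint16_t candidate : row)
      if (candidate == opcode)
        return row[target];
  return -1;
}
```

In this model, replacementFor(VPBROADCASTDYrm, PackedSingle) yields VBROADCASTSSYrm, which mirrors the vpbroadcastd -> vbroadcastss change in the sequences above. An opcode without a row, such as VMAXPSYrr here, yields -1 and stays in its domain.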

This patch gives a few speed-ups across the llvm test-suite + spec (-O3 -march=core-avx2).
To avoid noise, I report here only the tests that show a difference in the disassembly and that run for more than 1s.
Benchmark_ID    	Reference	Test    	Speedup 	Percent
-------------------------------------------------------------------------------
ASCI_Purple/SMG2000/smg2000        	       1.5325	       1.5386	       1	    +0%
CINT2000/181.mcf/181.mcf           	       3.7871	       3.7725	       1	    +0%
CINT2006/401.bzip2/401.bzip2       	       1.5876	       1.5811	       1	    +0%
CINT2006/456.hmmer/456.hmmer       	       1.9414	        1.941	       1	    +0%
Misc/salsa20                       	       4.7333	       4.7319	       1	    +0%
PAQ8p/paq8p                        	       28.187	      28.2113	       1	    +0%
Polybench/linear-algebra/kernels/sy	      11.5787	      11.5834	       1	    +0%
Polybench/linear-algebra/kernels/sy	       2.9281	       2.9271	       1	    +0%
TSVC/ControlFlow-dbl/ControlFlow-db	        2.589	       2.5916	       1	    +0%
TSVC/ControlFlow-flt/ControlFlow-fl	       2.2556	       2.1948	    1.03	    +3%
TSVC/ControlLoops-flt/ControlLoops-	       1.6671	       1.6693	       1	    +0%
TSVC/Equivalencing-dbl/Equivalencin	       1.5151	       1.4524	    1.04	    +4%
TSVC/Expansion-dbl/Expansion-dbl   	        2.483	       2.4818	       1	    +0%
TSVC/IndirectAddressing-flt/Indirec	       2.2959	       2.2353	    1.03	    +3%
TSVC/InductionVariable-dbl/Inductio	       2.7742	       2.7731	       1	    +0%
TSVC/LinearDependence-dbl/LinearDep	       2.2146	       2.2224	       1	    +0%
TSVC/LinearDependence-flt/LinearDep	       1.5389	       1.5225	    1.01	    +1%
TSVC/NodeSplitting-dbl/NodeSplittin	       2.4763	       2.4753	       1	    +0%
TSVC/Packing-dbl/Packing-dbl       	        2.595	       2.5093	    1.03	    +3%
TSVC/Packing-flt/Packing-flt       	       2.2906	       2.2796	       1	    +0%
TSVC/Searching-dbl/Searching-dbl   	       2.9092	       2.8992	       1	    +0%
TSVC/Searching-flt/Searching-flt   	        2.893	       2.8579	    1.01	    +1%
TSVC/StatementReordering-dbl/Statem	       2.5175	       2.5113	       1	    +0%
nbench/nbench                      	       5.2255	       5.2213	       1	    +0%
sqlite3/sqlite3                    	        2.113	       2.1165	       1	    +0%
-------------------------------------------------------------------------------
Min (25)                           	            -	            -	       1	      -
-------------------------------------------------------------------------------
Max (25)                           	            -	            -	    1.04	      -
-------------------------------------------------------------------------------
Sum (25)                           	           99	           98	       1	    +0%
-------------------------------------------------------------------------------
A.Mean (25)                        	            -	            -	    1.01	    +1%
-------------------------------------------------------------------------------
G.Mean 2 (25)                      	            -	            -	    1.01	    +1%
-------------------------------------------------------------------------------

Thanks,
-Quentin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exefix-broadcast.patch
Type: application/octet-stream
Size: 9578 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140325/c70156f3/attachment.obj>