[PATCH] Improve DAG combine pass on certain IR vector patterns

Fri Jan 16 15:30:16 PST 2015

I double-checked against Agner’s resources on Haswell, and he says:

"However, there are fewer such delays on Haswell than on previous processors. I found no such delays in the following cases:
	•   when a floating point Boolean instruction, such as ORPS is used with integer data

	•   when a wrong type of move instruction is used, e.g. MOVAPS or MOVDQA

	•   when a wrong type of shuffle instruction is used, e.g. SHUFPS or PSHUFD"

So it looks like there shouldn’t be any cost for a movq/movhpd pairing, at least on Haswell.

Fiona