Please review: Optimize vector multiply on X86

Demikhovsky, Elena elena.demikhovsky at intel.com
Thu Jun 20 02:34:11 PDT 2013


Hi Andrea,

You are right that 'isSplatVector' returns true even if the elements are not all the same. But later I check the splat value for 0, -1, and power of 2.
And the tests cover every case:
 
=== Multiply by 2 gives add
+; CHECK: mul_const1
+; CHECK: vpaddd
+; CHECK: ret
+define <8 x i32> @mul_const1(<8 x i32> %x) {
+  %y = mul <8 x i32> %x, <i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
+  ret <8 x i32> %y
+}
+

=== Multiply by 4 gives shift left
+; CHECK: mul_const2
+; CHECK: vpsllq  $2
+; CHECK: ret
+define <4 x i64> @mul_const2(<4 x i64> %x) {
+  %y = mul <4 x i64> %x, <i64 4, i64 4, i64 4, i64 4>
+  ret <4 x i64> %y
+}
+

=== Multiply by 0 gives vxorps
+
+; CHECK: mul_const5
+; CHECK: vxorps
+; CHECK-NEXT: ret
+define <8 x i32> @mul_const5(<8 x i32> %x) {
+  %y = mul <8 x i32> %x, <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
+  ret <8 x i32> %y
+}
+

=== Non-splat int value is not optimized

+; CHECK: mul_const6
+; CHECK: vpmulld
+; CHECK: ret
+define <8 x i32> @mul_const6(<8 x i32> %x) {
+  %y = mul <8 x i32> %x, <i32 0, i32 0, i32 0, i32 2, i32 0, i32 2, i32 0, i32 0>
+  ret <8 x i32> %y
+}

-  Elena


-----Original Message-----
From: Andrea_DiBiagio at sn.scee.net [mailto:Andrea_DiBiagio at sn.scee.net] 
Sent: Wednesday, June 19, 2013 20:09
To: Demikhovsky, Elena
Cc: Benjamin Kramer; llvm-commits at cs.uiuc.edu; llvm-commits-bounces at cs.uiuc.edu
Subject: RE: Please review: Optimize vector multiply on X86

Hi Elena,

> From: "Demikhovsky, Elena" <elena.demikhovsky at intel.com>
> I did the MUL optimization common for all targets. Please review.

Your function 'isSplatVector' also returns true when the elements are not all the same.
Example: a vector of four i64 defined as <i64 4, i64 0, i64 0, i64 0> is still considered to be a valid splat with a SplatBitSize equal to the vector size.
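
To illustrate why such a vector still counts as a splat, here is a simplified model of the halving idea: keep splitting the constant in two while both halves match, and report the size of the smallest repeating unit. It works on whole elements rather than bits, the name smallestRepeatingUnit is hypothetical, and it is only a sketch, not the LLVM implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the length, in elements, of the smallest
// repeating unit found by repeatedly halving the vector. A result equal to
// elts.size() means the only "splat" is the whole vector itself.
std::size_t smallestRepeatingUnit(const std::vector<std::uint64_t> &elts) {
  std::size_t n = elts.size();
  while (n > 1 && n % 2 == 0) {
    std::size_t half = n / 2;
    bool same = true;
    // Compare the low half of the current unit with its high half.
    for (std::size_t i = 0; i < half; ++i) {
      if (elts[i] != elts[i + half]) {
        same = false;
        break;
      }
    }
    if (!same)
      break;
    n = half; // Both halves match, so the unit is at most half as long.
  }
  return n;
}
```

In this model <i64 4, i64 0, i64 0, i64 0> yields a unit of 4 elements (the whole vector), while <i64 4, i64 4, i64 4, i64 4> yields 1. Only the second is a per-element splat.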

With your patch, if I run
llc -mtriple=x86_64-apple-darwin -mcpu=core-avx2 -mattr=+avx2 on the following function:

define <8 x i32> @mul_const1(<8 x i32> %x) {
  %y = mul <8 x i32> %x, <i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
  ret <8 x i32> %y
}

I get that the multiply is wrongly combined into vpaddd %ymm0, %ymm0, %ymm0

Another example is:
define <4 x i64> @mul_const2(<4 x i64> %x) {
  %y = mul <4 x i64> %x, <i64 4, i64 0, i64 0, i64 0>
  ret <4 x i64> %y
}

Here the multiply is wrongly combined into vpsllq $2, %ymm0, %ymm0

If you want to make sure that 'isSplatVector' returns 'true' _only_ in the case where the elements of a BUILD_VECTOR are exactly the same (or Undef), then you should check both SplatBitSize and HasAnyUndefs. Alternatively, in 'isSplatVector' you could implement a loop that checks if each operand of the BUILD_VECTOR is either Undef or the same ConstantSDNode.
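
The loop-based alternative could look roughly like the sketch below. It is a standalone model, not code against the SelectionDAG API: each operand is represented as a std::optional where an empty value stands for Undef, and the names Operand and getUniformSplat are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Model of a BUILD_VECTOR operand: a constant, or std::nullopt for Undef.
using Operand = std::optional<std::int64_t>;

// Hypothetical helper: returns the common constant if every defined operand
// holds the same value, and std::nullopt if two defined operands differ or
// if all operands are Undef.
std::optional<std::int64_t> getUniformSplat(const std::vector<Operand> &ops) {
  std::optional<std::int64_t> splat;
  for (const Operand &op : ops) {
    if (!op)
      continue; // Undef operands are compatible with any splat value.
    if (!splat)
      splat = *op; // First defined operand fixes the candidate value.
    else if (*splat != *op)
      return std::nullopt; // Mismatch: not a per-element splat.
  }
  return splat;
}
```

With this check, <i32 2, i32 0, i32 0, ...> is rejected, so the multiply would be left alone, while <i32 2, i32 2, undef, i32 2> is still accepted with splat value 2.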

I hope this helps!
Andrea Di Biagio
SN Systems - Sony Computer Entertainment Group
