Please review: Optimize vector multiply on X86
Nadav Rotem
nrotem at apple.com
Tue Jun 25 17:33:40 PDT 2013
Hi Elena,
isSplatVector does not do what the name says. It checks it there is a splat constant. It should have a different name.
+ N0IsConst = isSplatVector(N0.getNode(), ConstValue0);
+ N1IsConst = isSplatVector(N1.getNode(), ConstValue1);
}
+ else {
+ N0IsConst = dyn_cast<ConstantSDNode>(N0) != 0;
+ Const
The else should be on the same line.
Nadav
On 06/25/13, "Demikhovsky, Elena" <elena.demikhovsky at intel.com> wrote:
> Can anybody, please, take a look at this patch, I'd like to commit it.
>
> Thank you.
>
> - Elena
>
>
> -----Original Message-----
> From: Demikhovsky, Elena
> Sent: Sunday, June 23, 2013 14:00
> To: Andrea_DiBiagio at sn.scee.net
> Cc: Benjamin Kramer; llvm-commits at cs.uiuc.edu; llvm-commits-bounces at cs.uiuc.edu
> Subject: RE: Please review: Optimize vector multiply on X86
>
> I caught your point. I also fixed the same bug on SDIV lowering. This is the new patch.
>
> - Elena
>
>
> -----Original Message-----
> From: Andrea_DiBiagio at sn.scee.net [mailto:Andrea_DiBiagio at sn.scee.net] <Andrea_DiBiagio at sn.scee.net]>
> Sent: Thursday, June 20, 2013 16:15
> To: Demikhovsky, Elena
> Cc: Benjamin Kramer; llvm-commits at cs.uiuc.edu; llvm-commits-bounces at cs.uiuc.edu
> Subject: RE: Please review: Optimize vector multiply on X86
>
> Hi Elena,
>
> > From: "Demikhovsky, Elena" <elena.demikhovsky at intel.com> Hi Andrea,
> >
> > You are right that 'isSplatVector' returns true if elements are not
> > the same. But later I check the slat value for 0, -1, pow2 And in the
> > tests I check everything:
>
> What I am saying is that the size in bit of the splat value returned by 'isSplatVector' matters. For example, it may be bigger than the vector element size in bits.
>
> Example, given the following <4 x i64> constant vectors:
> a) <i64 2, i64 0, i64 0, i64 0>
> b) <i64 2, i64 0, i64 2, i64 0>
> c) <i64 2, i64 2, i64 2, i64 2>
>
> They are all considered to be valid vector splats according to 'isSplatVector' and the splat value is the same for all of them. What changes however is the SplatBitSize:
> a) SplatValue = 2 ;; SplatBitSize = 256;
> b) SplatValue = 2 ;; SplatBitSize = 128;
> c) SplatValue = 2 ;; SplatBitSize = 64;
>
> > ==== Multiple 2 gives add
> > +define <8 x i32> @mul_const1(<8 x i32> %x) {
> > + %y = mul <8 x i32> %x, <i32 2, i32 2, i32 2, i32 2, i32 2, i32 2,
> > i32 2, i32 2>
> > + ret <8 x i32> %y
> > +}
>
> In the example above:
> if you substitute the second operand of the multiply instruction with any of the following constant vectors:
>
> - <i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
> - <i32 2, i32 0, i32 0, i32 0, i32 2, i32 0, i32 0, i32 0>
> - <i32 2, i32 0, i32 2, i32 0, i32 2, i32 0, i32 2, i32 0>
>
> You will find out that your new combine transformation always optimizes the multiply into an add, and that causes incorrect codegen.
> That is because the compiler applies the following sequence of rules:
> 1) fold (mul x, (1 << c)) -> x << c (in DAGCombiner::visitMul())
> 2) (shl V, 1) -> add V, V (in X86ISelLowering.cpp, function
> PerformSHLCombine())
>
> > === non-spalt int value is not optimized
> > +define <8 x i32> @mul_const6(<8 x i32> %x) {
> > + %y = mul <8 x i32> %x, <i32 0, i32 0, i32 0, i32 2, i32 0, i32 2,
> > i32 0, i32 0>
> > + ret <8 x i32> %y
> > +}
>
> That is true for the example above but it is not always true in general.
> If you change the code above into:
>
> define <8 x i32> @mul_const6(<8 x i32> %x) {
> %y = mul <8 x i32> %x, <i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0,
> i32 0>
> ret <8 x i32> %y
> }
>
> You will get that the multiply is wrongly converted into an add.
>
> Andrea Di Biagio
> SN Systems - Sony Computer Entertainment Group
>
>
> **********************************************************************
> This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed.
> If you have received this email in error please notify postmaster at scee.net This footnote also confirms that this email message has been checked for all known viruses.
> Sony Computer Entertainment Europe Limited Registered Office: 10 Great Marlborough Street, London W1F 7LP, United Kingdom Registered in England: 3277793
> **********************************************************************
>
> P Please consider the environment before printing this e-mail
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130625/636f582d/attachment.html>
More information about the llvm-commits
mailing list