[LLVMdev] Failure to optimize vector select

Tue Aug 20 15:11:14 PDT 2013

Nadav,
 I think what Matt was asking is why the slp-vectorizer does not vectorize the booleans. It seems to me that the vectorizer got the first step right (vectorizing the operands) but not the second step (vectorizing the comparison operation). I would actually expect a single icmp ne <4 x i32> %c, <i32 0, i32 0, i32 0, i32 0> instead of four scalar icmps.
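
To make the difference concrete, here is a rough sketch in C using the GCC/Clang vector extensions rather than LLVM IR. It is not taken from the attached .ll files. The type name v4i32 and the function names are made up for illustration. The scalar function mirrors four separate compares, the vector function mirrors a single compare over all four lanes, and the last function shows how such a mask could feed a lane-wise select.

```c
#include <stdint.h>

/* GCC/Clang vector extension: four 32-bit ints, roughly analogous to
   <4 x i32>. The name v4i32 is hypothetical, chosen for this sketch. */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Scalar form: four independent compares against zero, similar in
   spirit to four scalar icmp ne instructions. */
void cmp_ne_zero_scalar(const int32_t c[4], int32_t mask[4]) {
    for (int i = 0; i < 4; ++i)
        mask[i] = (c[i] != 0) ? -1 : 0;
}

/* Vector form: one compare over all lanes. With these extensions each
   result lane is -1 (all bits set) when true and 0 when false. */
v4i32 cmp_ne_zero_vector(v4i32 c) {
    const v4i32 zero = {0, 0, 0, 0};
    return c != zero;
}

/* Lane-wise select driven by an all-ones/all-zeros mask:
   picks a[i] where mask[i] is -1 and b[i] where mask[i] is 0. */
v4i32 select_v4i32(v4i32 mask, v4i32 a, v4i32 b) {
    return (mask & a) | (~mask & b);
}
```

Both compare functions produce the same per-lane mask. The point of the sketch is only that the vector form needs one operation where the scalar form needs four.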

Micah

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
> Behalf Of Nadav Rotem
> Sent: Tuesday, August 20, 2013 2:49 PM
> To: Matt Arsenault
> Cc: Mailing List
> Subject: Re: [LLVMdev] Failure to optimize vector select
> 
> Hi Matt,
> 
> This code maintains a vector of float4 and it inserts and extracts values from
> this vector. The 'select' operations are already vectorized. Maybe a sequence
> of inst-combines (or DAG-combines) can help. If you re-write this code using
> scalars then the slp-vectorizer, with some tweaks, will be able to catch it.
> 
> Thanks,
> Nadav
> 
> 
> On Aug 20, 2013, at 1:14 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
> 
> > On Aug 20, 2013, at 10:22 , Nadav Rotem <nrotem at apple.com> wrote:
> >
> >> Can you send the IR of the function ?
> >
> > Attached is the -O0 and -O3 IR
> >
> > <vselect_optimized.ll><vselect_unoptimized.ll>