[LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?
Shahid, Asghar-ahmad
Asghar-ahmad.Shahid at amd.com
Tue Jan 20 08:37:31 PST 2015
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Das, Dibyendu
> Sent: Tuesday, January 20, 2015 2:42 PM
> To: Ahmed Bougacha; LLVM Dev
> Subject: Re: [LLVMdev] Vectorization Cost Models and Multi-Instruction
> Patterns?
>
> I guess it will help SAD generation also.
>
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Ahmed Bougacha
> Sent: Tuesday, January 20, 2015 5:21 AM
> To: LLVM Dev
> Subject: [LLVMdev] Vectorization Cost Models and Multi-Instruction
> Patterns?
>
> Hi all,
>
> While tinkering with saturation instructions, I hit problems with the cost
> model calculations.
>
> The loop vectorizer cost model accumulates the individual TTI cost model of
> each instruction. For saturating arithmetic, this is a gross overestimate, since
> you have 2 sexts (inputs), 2 icmps + 2 selects (for the saturation), and a
> truncate (output); these all fold alway.
> With an intrinsic, you'd end up with a better estimate; however, I'm trying to
> see what problems we would encounter without intrinsics, and I think this is
> the biggest one.
> Note that AFAICT, costs for min/max patterns (icmp+iselect) are also
> overestimated, but not as much as saturate.
>
>
> Proposal:
>
> Add a method, part of the vector API of TargetTransformInfo, for multi-
> instruction cost computation. It would take a scalar Instruction, and a
> reference to a set of Instruction. If it's able to match a min/max/saturate/..,
Are you trying to match the multi-instruction with library calls or Intrinsics?
If it is intrinsic, you may also need to check legality before matching.
> it adds all the matched instructions to the set, so the caller (say
> LoopVectorizationCostModel) can ignore them.
>
> But:
> - this all seems icky: a very blunt hammer.
> - what, if anything, should we do about legality checks? The expanded IR
> equivalent of a saturate uses larger types than necessary, so this might
> prevent vectorization. In practice it doesn't, because only load/store/PHI
> types are checked there.
IMO, it will still prevent vectorization because the value used by min/max/saturate
might have been defined by load/store/phi.
> - is this useful in other cases, beyond min/max (maybe abs ?) and saturate?
Yes, Sum-Of-Absolute-Differece is another case.
Regards,
Shahid
>
> Thanks!
>
> -Ahmed
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list