[LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?

Tue Jan 20 01:11:36 PST 2015

I guess it will help SAD generation also.

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Ahmed Bougacha
Sent: Tuesday, January 20, 2015 5:21 AM
To: LLVM Dev
Subject: [LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?

Hi all,

While tinkering with saturation instructions, I hit problems with the cost model calculations.

The loop vectorizer cost model accumulates the individual TTI cost model of each instruction.  For saturating arithmetic, this is a gross overestimate, since you have 2 sexts (inputs), 2 icmps + 2 selects (for the saturation), and a truncate (output); these all fold alway.
With an intrinsic, you'd end up with a better estimate; however, I'm trying to see what problems we would encounter without intrinsics, and I think this is the biggest one.
Note that AFAICT, costs for min/max patterns (icmp+iselect) are also overestimated, but not as much as saturate.

Proposal:

Add a method, part of the vector API of TargetTransformInfo, for multi-instruction cost computation.  It would take a scalar Instruction, and a reference to a set of Instruction.  If it's able to match a min/max/saturate/.., it adds all the matched instructions to the set, so the caller (say LoopVectorizationCostModel) can ignore them.

But:
-  this all seems icky: a very blunt hammer.
-  what, if anything, should we do about legality checks?  The expanded IR equivalent of a saturate uses larger types than necessary, so this might prevent vectorization.  In practice it doesn't, because only load/store/PHI types are checked there.
-  is this useful in other cases, beyond min/max (maybe abs ?)  and saturate?

Thanks!

-Ahmed
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev