[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

Tue May 5 08:09:43 PDT 2015

On 5 May 2015 at 15:41, Shahid, Asghar-ahmad
<Asghar-ahmad.Shahid at amd.com> wrote:
> With llvm.sad() intrinsic:
> VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
>
> W/ llvm.absd() and llvm.hadd()
> VC2  = Cost associated with "absolute diff" +  "horizontal add" ( ??? )
>
> As I will be querying with getIntrinsicCost(ID) for these two intrinsics separately, Will VC1==VC2?

I see. You are correct to say that this is a crude approximation.

What we do today is pick one of them and treat it as "cheap" or, if
that's not possible, hope its cost gets diluted among other, more
expensive instructions. Since the cost table is mostly there to get
things going, having 2/4 of the cost instead of 1/4 of the cost (for
diff+add of 4-way vectors instead of diff+add of 4 scalars) will
count little towards the final score, and it'll probably encourage
vectorization. In the generic cases that we fail to vectorize, we end
up increasing the cost of the scalar operations.
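
To make the 2/4 vs. 1/4 arithmetic concrete, here is a rough sketch of
one reading of it. All unit costs are made-up placeholders (not values
from any real cost table), and relative_cost() is a hypothetical helper,
not an LLVM API. It assumes the scalar diff+add work costs 1 per
element, so 4 scalars cost 4:

```python
from fractions import Fraction

VF = 4  # 4-way vectors, as in the example above

# Hypothetical unit costs; a real target would supply its own numbers.
SCALAR_COST_PER_LANE = 1  # assumed cost of the scalar diff+add work per element
VECTOR_COSTS = {"absd": 1, "hadd": 1, "sad": 1}


def relative_cost(vector_ops):
    """Vector cost divided by the cost of the same work done on VF scalars."""
    vector_cost = sum(VECTOR_COSTS[op] for op in vector_ops)
    scalar_cost = VF * SCALAR_COST_PER_LANE
    return Fraction(vector_cost, scalar_cost)


split = relative_cost(["absd", "hadd"])  # two intrinsics, costed separately
fused = relative_cost(["sad"])           # one PSAD-like instruction

# Both ratios are below 1, so the vector form looks cheaper either way.
print(split, fused)
```

Under these assumptions the split form is overestimated by a factor of
two, but it stays well below the scalar cost, so the decision to
vectorize does not change.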

I agree this is far from ideal, but it works reasonably well. The
alternative would be to have instruction pattern support, which would
give us more fine-grained control. I suggested this many years ago,
but so far the current model has been working well enough that we
haven't felt the need to implement complicated pattern-matching
support.

The cases where a pattern match would help are mainly:

* Detecting cases where the back end has special instructions for
multiple IR instructions. This is your case, and is common enough that
it should benefit almost all back-ends.

* Hazard detection, for instance when moving in and out of VFP
registers, or when two instructions in sequence are really bad on
specific CPUs. This would also benefit multiple back-ends, but
probably has less impact on the quality of the choices.
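
As a very rough sketch of what pattern-aware costing could look like
for the two cases above: the opcode names, costs, and pattern tables
below are all hypothetical, and a real implementation would match on
IR or DAG nodes rather than on a flat list of names.

```python
# Hypothetical per-instruction costs and patterns, for illustration only.
PER_OP_COST = {
    "absd": 1,
    "hadd": 1,
    "add": 1,
    "vmov_to_vfp": 1,
    "vmov_from_vfp": 1,
}
# Back end has a single instruction covering both operations.
FUSED_PAIRS = {("absd", "hadd"): 1}
# Extra penalty when these two appear back to back.
HAZARD_PAIRS = {("vmov_to_vfp", "vmov_from_vfp"): 3}


def per_instruction_cost(ops):
    """Current-style model: every instruction is costed on its own."""
    return sum(PER_OP_COST[op] for op in ops)


def pattern_cost(ops):
    """Pattern-aware model: check adjacent pairs before falling back."""
    total, i = 0, 0
    while i < len(ops):
        pair = tuple(ops[i:i + 2])
        if pair in FUSED_PAIRS:
            total += FUSED_PAIRS[pair]
            i += 2  # both instructions are covered by the fused one
            continue
        total += PER_OP_COST[ops[i]]
        if pair in HAZARD_PAIRS:
            total += HAZARD_PAIRS[pair]  # second op is still costed next round
        i += 1
    return total
```

With these placeholder numbers, per_instruction_cost(["absd", "hadd",
"add"]) gives 3 while pattern_cost() gives 2 for the same sequence, and
the back-to-back VFP moves go from 2 to 5 once the hazard penalty is
applied. The extra tables and matching logic are the added complexity
we would be signing up for.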

However, we should first try the current model, and only go towards
the more complex model if we have enough patterns that would benefit
strongly enough to compensate for the increase in complexity. This
should be a consensus decision, I think.

In any case, this is not an argument for implementing intrinsics just
because the cost model is not accurate enough. If anything, we should
fix the cost model.

cheers,
--renato