[PATCH][CodeGen] Adding "llvm.sad" intrinsic and corresponding ISD::SAD node for "Sum Of Absolute Differences" operation

James Molloy james at jamesmolloy.co.uk
Thu Apr 16 11:04:05 PDT 2015


Hi Ahmed,

I do agree with you. However, unwinding all the way and using only generic
IR leaves quite a long chain of instructions to match, and that match may
well be fragile in the face of IR and DAG optimizations. The DAG in
particular has a really irritating habit of moving nodes like
truncates/extends around, which makes long chains like this difficult to
match.

We've got:
  sum_all_lanes( abs( sub ) ) to match.

If sum_all_lanes and abs were single nodes, I think this would be perfectly
good enough. As it is, sum_all_lanes is log2(N) shuffles (each followed by
an add) and abs() is a comparison and a vector select.
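
To make the length of that chain concrete, here is a minimal Python sketch
of the expanded form, with plain lists standing in for vector lanes. The
function names (vec_sub, vec_abs, sum_all_lanes, sad) are illustrative only
and are not LLVM APIs. The sketch assumes a power-of-two lane count and
ignores integer wrap-around.

```python
def vec_sub(a, b):
    """Per-lane subtract, standing in for a vector `sub`."""
    return [x - y for x, y in zip(a, b)]


def vec_abs(v):
    """abs() in expanded form: a per-lane compare followed by a select."""
    negated = [-x for x in v]
    is_negative = [x < 0 for x in v]  # the compare
    # The select: pick the negated lane where the compare was true.
    return [n if m else x for m, n, x in zip(is_negative, negated, v)]


def sum_all_lanes(v):
    """Horizontal add as log2(N) shuffle+add steps; returns (sum, steps)."""
    n = len(v)
    assert n > 0 and n & (n - 1) == 0, "lane count must be a power of two"
    steps = 0
    width = n // 2
    while width >= 1:
        # "Shuffle" the upper half of the live lanes down, then add per lane.
        shuffled = v[width:2 * width] + [0] * (n - width)
        v = [x + y for x, y in zip(v, shuffled)]
        width //= 2
        steps += 1
    return v[0], steps


def sad(a, b):
    """sum_all_lanes(abs(sub)), the whole pattern a backend has to match."""
    return sum_all_lanes(vec_abs(vec_sub(a, b)))


total, steps = sad([1, 5, 9, 2], [4, 5, 1, 7])
```

For a 16-lane vector the horizontal part alone is four shuffle+add pairs,
on top of the compare and select for abs, which is why the full pattern is
easy for an optimization to disturb.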

Perhaps just having intrinsics for horizontal summations and the absolute
value might be good enough? I'm not sure.

Cheers,

James

On Thu, 16 Apr 2015 at 18:52 Ahmed Bougacha <ahmed.bougacha at gmail.com>
wrote:

> FWIW I agree with James' concerns: I thought the intrinsic had already
> been agreed upon, but that doesn't seem to be the case.
>
> This got me thinking, and perhaps that's what you're saying, James: if
> the SAD intrinsic is too X86-specific, why do we even need an
> intrinsic? We already express horizontal reductions using the
> expanded form (if that's not ideal, then that's a separate topic);
> the SUB is just a "sub <16 x i8>".
>
> The tricky part is the ABS, which AFAIK we express using the expanded
> form as well (InstCombine seems to agree).
>
> If the only reason for the intrinsic is making it easier on the
> vectorizer (I haven't really followed that discussion), it sounds like
> there's something to be done for the more general problem (e.g. MAX,
> or SAT, which I have not gotten back to yet).
>
> Am I making sense?
>
> -Ahmed
>
> On Thu, Apr 16, 2015 at 2:30 AM, James Molloy <james.molloy at arm.com>
> wrote:
> > Hi,
> >
> > Thanks for continuing to work on this!
> >
> > From a high level, I still stand by the comments I made on your
> LoopVectorizer review that this is a bit too Intel-specific. ARM and
> AArch64's SAD instructions do not behave as you model here. You model SAD
> as a per-element subtract, abs, then a cross-lane (horizontal) summation.
> The VABD instruction in ARM does not do the last step - that is assumed to
> be done in a separate instruction.
> >
> > Now, we can model your intrinsic semantics in ARM using:
> >
> >   val2 = ABD val0, val1
> >   val3 = ADDV val2
> >
> > However, that isn't very efficient - the sum-across-lanes is expensive.
> Ideally, in a loop, we'd do something like this:
> >
> >   loop:
> >     reduction1 = ABD array0[i], array1[i]
> >     br loop
> >   val = ADDV reduction1
> >
> > So we'd only do the horizontal step once, not on every iteration. That
> is why I think having the intrinsic represent just the per-element
> operations, not the horizontal part, would be the lowest common denominator
> between our two targets. It is easier to pattern match two nodes than to
> split one node apart and LICM part of it.
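
The difference between the two loop shapes can be sketched in Python, with
lists standing in for vectors. This is only an illustration: abd and addv
are stand-ins named after the ABD and ADDV operations above, not real APIs,
and the sketch assumes the accumulation into reduction1 is implied by the
pseudocode and that the array length is a multiple of the lane count.

```python
def abd(a, b):
    """Per-lane absolute difference (the per-element part only)."""
    return [abs(x - y) for x, y in zip(a, b)]


def addv(v):
    """Sum across lanes (the horizontal step)."""
    return sum(v)


def sad_horizontal_every_iteration(array0, array1, lanes):
    """Fused form: one horizontal sum per loop iteration."""
    assert len(array0) == len(array1) and len(array0) % lanes == 0
    total, horizontal_ops = 0, 0
    for i in range(0, len(array0), lanes):
        total += addv(abd(array0[i:i + lanes], array1[i:i + lanes]))
        horizontal_ops += 1
    return total, horizontal_ops


def sad_horizontal_once(array0, array1, lanes):
    """Split form: accumulate per lane, one horizontal sum after the loop."""
    assert len(array0) == len(array1) and len(array0) % lanes == 0
    acc = [0] * lanes  # vector accumulator, one partial sum per lane
    for i in range(0, len(array0), lanes):
        diff = abd(array0[i:i + lanes], array1[i:i + lanes])
        acc = [s + d for s, d in zip(acc, diff)]
    return addv(acc), 1


a = list(range(16))
b = [15 - x for x in a]
fused = sad_horizontal_every_iteration(a, b, lanes=4)
split = sad_horizontal_once(a, b, lanes=4)
```

Both functions return the same total; the second element of each result
counts how many horizontal sums were needed, which grows with the trip
count in the fused form and stays at one in the split form.
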
> >
> > Similarly, if the loop vectorizer emitted the above construction, it
> would not be very good for you: you would never be able to match your PSAD
> instruction. So my suggestion is this:
> >
> > - The intrinsic only models the per-element operations, not the
> horizontal part.
> > - The X86 backend pattern matches the sad intrinsic plus a horizontal
> reduction to its PSAD node.
> > - The Loop Vectorizer has code to emit *either* the per-element-only or
> the per-element-plus-horizontal SAD as part of its reduction logic.
> > - The TargetTransformInfo hook is extended to return an `enum SADType`,
> in much the same way as the `popcnt` type is handled currently.
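
As a rough illustration of the proposal above, the following Python sketch
shows how a vectorizer could branch on such an enum. Everything here is
hypothetical: the member names of SADType, the function plan_sad_reduction,
and the operation names in the returned lists are invented for the sketch
and do not correspond to any existing LLVM interface.

```python
from enum import Enum


class SADType(Enum):
    """Hypothetical per-target answer; member names are invented here."""
    NONE = 0                         # no SAD support, use generic ops
    PER_ELEMENT = 1                  # sub + abs only (the ABD-style form)
    PER_ELEMENT_PLUS_HORIZONTAL = 2  # sub + abs + horizontal sum (PSAD-style)


def plan_sad_reduction(sad_type):
    """Return (ops emitted inside the loop, ops emitted after the loop)."""
    if sad_type is SADType.PER_ELEMENT:
        # Accumulate per lane in the loop, reduce across lanes once at the end.
        return ["sad_per_element", "add"], ["horizontal_sum"]
    if sad_type is SADType.PER_ELEMENT_PLUS_HORIZONTAL:
        # Emit the per-element op next to a horizontal reduction so a backend
        # can pattern match the pair to a single fused node.
        return ["sad_per_element", "horizontal_sum", "add"], []
    # Fallback: the fully expanded generic form.
    return ["sub", "abs", "add"], ["horizontal_sum"]


in_loop, after_loop = plan_sad_reduction(SADType.PER_ELEMENT)
```

The point of the sketch is only that the intrinsic itself stays per-element
in both supported cases; what changes per target is where the horizontal
reduction is placed.
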
> >
> > How does this sound?
> >
> > Cheers,
> >
> > James
> >
> >
> > REPOSITORY
> >   rL LLVM
> >
> > http://reviews.llvm.org/D9029