[PATCH] D10964: [Codegen] Add intrinsics 'hadd*' and corresponding SDNodes for horizontal sum operation.

Tue Jul 7 06:38:43 PDT 2015

Hi,

> Ah, I did not think about it. Instead of restricting it to fast-math I
> would prefer to have an order such as "add each element of vector, starting
> from element 0 to n-1, to an accumulated sum which is initialized to zero".
> Does it make sense?

That would mean you wouldn't be able to lower it using a lg(n)-shuffles
algorithm, as that does it in the wrong order. You'd have to use a linear
algorithm which would perform quite poorly. It would also stop horizontal
add instructions being used on architectures that support them (I don't
know of any that do for FP types - probably for this reason!). I'd probably
go with the fast-math version personally.
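
To make the ordering issue concrete, here is a rough C++ sketch of the two
reduction orders. The function names are hypothetical and nothing here is
taken from the patch; the pairwise version only mimics what a shuffle-based
lowering would compute, assuming a power-of-two element count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ordered reduction: start from zero and add elements 0..n-1 in turn.
// This needs n dependent additions.
float linear_hadd(const std::vector<float> &v) {
  float acc = 0.0f;
  for (float x : v)
    acc += x;
  return acc;
}

// Pairwise reduction mimicking a shuffle-based lowering: at each step the
// upper half of the live lanes is added onto the lower half, so only
// log2(n) steps are needed. Assumes n is a non-zero power of two.
float pairwise_hadd(std::vector<float> v) {
  assert(!v.empty() && (v.size() & (v.size() - 1)) == 0);
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2)
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half];
  return v[0];
}
```

With IEEE-754 single precision and no fast-math, the input
{1e8, 1, -1e8, 1} gives 1 for the ordered version (the first 1 is lost when
added to 1e8) and 2 for the pairwise version, so the two orders are not
interchangeable for FP types.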

> In that case what should be the return type of the intrinsic?

llvm_any_ty?

On Tue, 7 Jul 2015 at 11:59 Shahid <Asghar-ahmad.Shahid at amd.com> wrote:

> Hi James,
>
> Thanks for your comments. I will make the changes. Please see my responses below.
>
> Regards,
> Shahid
>
>
> ================
> Comment at: docs/LangRef.rst:9593
> @@ +9592,3 @@
> +      declare <4 x integer> @llvm.hadd.v4i32(<4 x integer> %a)
> +      declare <4 x float> @llvm.hadd.v4f32(<4 x float> %a)
> +
> ----------------
> jmolloy wrote:
> > You need to be very explicit about the behaviour of this intrinsic with
> > floating point arguments. What order, if any, does it perform the adds in?
> > If there is no guaranteed order, it can only be used in fast-math mode.
> Ah, I did not think about it. Instead of restricting it to fast-math I
> would prefer to have an order such as "add each element of vector, starting
> from element 0 to n-1, to an accumulated sum which is initialized to zero".
> Does it make sense?
>
> ================
> Comment at: include/llvm/IR/Intrinsics.td:599
> @@ +598,3 @@
> +// Calculate the horizontal/reduction sum across the elements of input vector.
> +def int_hadd_int : Intrinsic<[llvm_anyint_ty], [llvm_anyvector_ty], [IntrNoMem]>;
> +def int_hadd_float : Intrinsic<[llvm_anyfloat_ty], [llvm_anyvector_ty], [IntrNoMem]>;
> ----------------
> jmolloy wrote:
> > Just having one intrinsic here would be good; there's no need for a
> > separate int and float version.
> In that case what should be the return type of the intrinsic?
>
> ================
> Comment at: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp:712
> @@ -707,1 +711,3 @@
> +  case ISD::FHADD:
> +    return UnrollHADD(Op);
>    default:
> ----------------
> jmolloy wrote:
> > Can't you just call ExpandHADD() here? Or at least share the unroll and
> > expand code?
> Yes, I can probably share it.
>
>
> Repository:
>   rL LLVM
>
> http://reviews.llvm.org/D10964