[PATCH] D10964: [Codegen] Add intrinsics 'hadd*' and corresponding SDNodes for horizontal sum operation.
Shahid
Asghar-ahmad.Shahid at amd.com
Tue Jul 7 03:55:01 PDT 2015
Hi James,
Thanks for your comments. I will make the requested changes. Please see my responses below.
Regards,
Shahid
================
Comment at: docs/LangRef.rst:9593
@@ +9592,3 @@
+ declare <4 x integer> @llvm.hadd.v4i32(<4 x integer> %a)
+ declare <4 x float> @llvm.hadd.v4f32(<4 x float> %a)
+
----------------
jmolloy wrote:
> You need to be very explicit about the behaviour of this intrinsic with floating point arguments. What order, if any, does it perform the adds in? If there is no guaranteed order, it can only be used in fast-math mode.
Ah, I had not thought about that. Instead of restricting it to fast-math, I would prefer to define an order, such as "add each element of the vector, from element 0 to n-1, to an accumulated sum that is initialized to zero". Does that make sense?
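To illustrate why the order has to be specified for floating point, here is a minimal standalone sketch. It is not code from the patch: the names orderedHAdd and treeHAdd are hypothetical, and the tree variant is only one possible unordered lowering, assumed here for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference for the proposed ordering (not from the patch):
// add elements 0..n-1, in that order, to an accumulator initialized to zero.
float orderedHAdd(const std::vector<float> &V) {
  float Acc = 0.0f;
  for (float Elt : V)
    Acc += Elt;
  return Acc;
}

// Hypothetical unordered alternative: a log2(n)-step tree reduction that
// repeatedly adds the upper half of the vector onto the lower half.
// Takes V by value because it overwrites the elements while reducing.
float treeHAdd(std::vector<float> V) {
  assert(!V.empty() && (V.size() & (V.size() - 1)) == 0 &&
         "expects a power-of-two element count");
  for (std::size_t Half = V.size() / 2; Half > 0; Half /= 2)
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half];
  return V[0];
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f}, assuming the additions are evaluated in IEEE single precision, the in-order sum returns 1.0 and the tree sum returns 2.0. Integer addition does not have this problem, which is why the ordering only needs to be spelled out for the floating-point case.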
================
Comment at: include/llvm/IR/Intrinsics.td:599
@@ +598,3 @@
+// Calculate the horizontal/reduction sum across the elements of input vector.
+def int_hadd_int : Intrinsic<[llvm_anyint_ty], [llvm_anyvector_ty], [IntrNoMem]>;
+def int_hadd_float : Intrinsic<[llvm_anyfloat_ty], [llvm_anyvector_ty], [IntrNoMem]>;
----------------
jmolloy wrote:
> Just having one intrinsic here would be good; there's no need for a separate int and float version.
In that case, what should the return type of the intrinsic be?
================
Comment at: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp:712
@@ -707,1 +711,3 @@
+ case ISD::FHADD:
+ return UnrollHADD(Op);
default:
----------------
jmolloy wrote:
> Can't you just call ExpandHADD() here? or at least share the unroll and expand code?
Yes, I can probably share the unroll and expand code.
Repository:
rL LLVM
http://reviews.llvm.org/D10964