[PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling with BHSD suffix

Thu Aug 22 19:27:53 PDT 2013

Hi Ana,

The name mangling for vpmaxnmqd_f64, vpmaxqd_f64, vpaddd_f64 is still under
review, and we will let you know the final version ASAP.

I want to make the following code change in my patch:

@@ -713,10 +735,15 @@ static std::string MangleName(const std::string
&name, StringRef typestr,

   // Insert a 'q' before the first '_' character so that it ends up before
   // _lane or _n on vector-scalar operations.
-  if (typestr.startswith("Q")) {
+  if (typestr.find("Q") != StringRef::npos) {
       size_t pos = s.find('_');
       s = s.insert(pos, "q");
   }

Now no matter if 'q' exist in the mangled name exists or not, my patch can
always manage to generate the mangled name as we want.

At present, you can define them as below,

def SCALAR_FMAXP : SInst<"vpmax", "sd", "SfSQd">;
def SCALAR_VPADD : SInst<"vpadd", "sd", "SHd">;

Proper suffix can be inserted with certain combination of type prefix.



2013/8/23 Ana Pazos <apazos at codeaurora.org>

> Hi Kevin,****
>
> ** **
>
> Thanks, it should work too. We do that in EmitIntrinsic and genTargetTest.
> ****
>
> ** **
>
> One more request:****
>
> Can you clarify on the naming convention for Scalar Reduce Pairwise
> (FMAXP, FMAXNMP, FMINP, FMINNMP) intrinsic names. Maybe ARM Ltd. has
> updated names.****
>
> ** **
>
> Gcc’s arm_neon.h adds a ‘q’ character to the intrinsic name.****
>
> ** **
>
> But another instruction from the same instruction class, FADDP, does not
> have the additional ‘q’ in its intrinsic name. That is weird since the
> prototypes are the same:****
>
> ** **
>
> float64_t vpmaxnm*q*d_f64 (float64x2_t a)****
>
> float64_t vpmax*q*d_f64 (float64x2_t a)****
>
> float64_t vpaddd_f64 (float64x2_t a)****
>
> ** **
>
> If the ‘q’ is needed, then we need to further change the patch to generate
> the correct name. This can be done now or when we check in the Scalar
> Reduce Pairwise implementation.****
>
> ** **
>
> You can reproduce the issue by adding these declarations:****
>
> def SCALAR_FMAXP : SInst<"vpmax", "sd", "SfSQd">;****
>
> def SCALAR_FMAXNMP : SInst<"vpmaxnm", "sd", "SfSQd">;****
>
> ** **
>
> Thanks,****
>
> Ana.****
>
> ** **
>
> *From:* Kevin Qin [mailto:kevinqindev at gmail.com]
> *Sent:* Wednesday, August 21, 2013 10:51 PM
> *To:* Ana Pazos
> *Cc:* Joey Gouly; Jiangning Liu; Hao Liu; cfe-commits at cs.uiuc.edu;
> mcrosier at codeaurora.org
>
> *Subject:* Re: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling
> with BHSD suffix****
>
> ** **
>
> Hi Ana,****
>
> ** **
>
> Thanks for your feedback. I can reproduce this problem and fix it in
> attached patch. My solution is record both intrinsic name and its
> prototypes in map. So only a definition with same name and same prototypes
> will be treated as redefinition. I think this solution is more reasonable
> and fit more scenarios. I have tested this patch by comparing arm_neon.h,
> arm_neon.inc before and after this patch. They are all the same. Also I
> test defining SInst<"vqadd", "sss", "ScSsSiSlSUcSUsSUiSUl">. There is no
> compile error now.****
>
> ** **
>
> 2013/8/22 Ana Pazos <apazos at codeaurora.org>****
>
> Hi Kevin,****
>
>  ****
>
> I tried to use your patch for defining the Scalar Arithmetic and and
> Scalar Reduce Pairwise intrinsics I am working on.****
>
>  ****
>
> I encountered an issue with Scalar Saturating Add which I defined as****
>
>  ****
>
> def SCALAR_QADD   : SInst<"vqadd", "sss", "ScSsSiSlSUcSUsSUiSUl"> ****
>
>  ****
>
> and when enabling tools/clang/lib/Headers/Makefile to build the
> auto-generated headers arm_neon.h., arm_neon_test.h and arm_neon_sema.h.**
> **
>
>  ****
>
> Because I am reusing the intrinsic name “vqadd”, the ARM v7 legacy
> builtins associated with this same intrinsic name ended up not being added
> to AArch64 headers as they should.****
>
>  ****
>
> Below is one possible solution (simply do not add AArch64 Scalar
> intrinsics to the AArch64 Map structure; changed arm_neon.td and
> NeonEmitter.cpp).****
>
>  ****
>
> See if you can reproduce the issue, and if you agree with the change
> below, please add it to your patch.****
>
>  ****
>
> diff --git a/utils/TableGen/NeonEmitter.cpp
> b/utils/TableGen/NeonEmitter.cpp****
>
> index 8924278..fece66a 100644****
>
> --- a/utils/TableGen/NeonEmitter.cpp****
>
> +++ b/utils/TableGen/NeonEmitter.cpp****
>
> @@ -2396,6 +2396,8 @@ void NeonEmitter::runHeader(raw_ostream &OS) {****
>
>    std::vector<Record *> RV = Records.getAllDerivedDefinitions("Inst");***
> *
>
>  ****
>
>    // build a map of AArch64 intriniscs to be used in uniqueness checks.**
> **
>
> +  // The map does not include AArch64 scalar intrinsics because their name
> ****
>
> +  // mangling and non-overloaded builtins prevent name conflict.****
>
>    StringMap<ClassKind> A64IntrinsicMap;****
>
>    for (unsigned i = 0, e = RV.size(); i != e; ++i) {****
>
>      Record *R = RV[i];****
>
> @@ -2404,6 +2406,10 @@ void NeonEmitter::runHeader(raw_ostream &OS) {****
>
>      if (!isA64)****
>
>        continue;****
>
>  ****
>
> +       bool isScalar = R->getValueAsBit("isScalar");****
>
> +    if (isScalar)****
>
> +      continue;****
>
> +****
>
>  ****
>
> diff --git a/include/clang/Basic/arm_neon.td b/include/clang/Basic/
> arm_neon.td****
>
> index 6918f0a..0a98320 100644****
>
> --- a/include/clang/Basic/arm_neon.td****
>
> +++ b/include/clang/Basic/arm_neon.td****
>
> @@ -79,6 +79,7 @@ class Inst <string n, string p, string t, Op o> {****
>
>    bit isShift = 0;****
>
>    bit isVCVT_N = 0;****
>
>    bit isA64 = 0;****
>
> +  bit isScalar = 0;****
>
>  ****
>
>    // Certain intrinsics have different names than their representative***
> *
>
>    // instructions. This field allows us to handle this correctly when we*
> ***
>
> @@ -564,15 +565,46 @@ def SHLL_HIGH_N    : SInst<"vshll_high_n", "ndi",
> "HcHsHiHUcHUsHUi"****
>
> // Converting vectors****
>
> def VMOVL_HIGH   : SInst<"vmovl_high", "nd", "HcHsHiHUcHUsHUi">;****
>
>  ****
>
> +}****
>
>  ****
>
> +let isA64 = 1, isScalar = 1 in {****
>
>
> ////////////////////////////////////////////////////////////////////////////////
> ****
>
> // Scalar Arithmetic****
>
>  ****
>
>  ****
>
> Thanks,****
>
> Ana.****
>
>  ****
>
>  ****
>
> Some more background on the issue…****
>
>  ****
>
> Currently we allow defining AArch64 intrinsics with the same name as ARM
> intrinsics.****
>
> Usually this happens when the AArch64 intrinsic can handle more types than
> the ARM intrinsic. It is seen as a superset of ARM intrinsics.****
>
> So the Prototype string might differ, but they have the same Type string,
> e.g., "ddd", and Name string.****
>
>  ****
>
> Example:****
>
> def VADD    : IOpInst<"vadd", "ddd",
> "csilfUcUsUiUlQcQsQiQlQfQUcQUsQUiQUl", OP_ADD>; --> used by ARM NEON****
>
> ...****
>
> let isA64 = 1 in {****
>
> def ADD : IOpInst<"vadd", "ddd", "csilfUcUsUiUlQcQsQiQlQfQUcQUsQUiQUlQd",
> OP_ADD>; --> with additional Qd type, used by AArch64 NEON****
>
> ...****
>
> }****
>
>  ****
>
> When generating the intrinsics definitions and builtin definitions we
> check for such conflicts using Maps structures.****
>
> We only include ARM NEON legacy intrinsics and builtin definitions into
> AArch64 headers/tests if AArch64 does not redefine them. Otherwise the
> definitions marked as isA64 prevail (they are the superset).****
>
>  ****
>
> Now we have a new situation with scalar intrinsics whose Type strings will
> be different.****
>
> For example, Scalar Saturating Add, which requires a different Types
> string "sss".****
>
> It would be nice to define it as: SInst<"vqadd", "sss",
> "ScSsSiSlSUcSUsSUiSUl">. But it will fail to compile.****
>
> There are a couple of possible solutions.****
>
> This can be address in your patch or in the next one by whoever needs this
> fix.****
>
>  ****
>
> *From:* Kevin Qin [mailto:kevinqindev at gmail.com] ****
>
> *Sent:* Tuesday, August 20, 2013 7:02 PM
> *To:* Ana Pazos****
>
> *Cc:* Joey Gouly; Jiangning Liu; Hao Liu; llvm-commits at cs.uiuc.edu;
> cfe-commits at cs.uiuc.edu****
>
>
> *Subject:* Re: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling
> with BHSD suffix****
>
>  ****
>
> Hi Ana,****
>
>  ****
>
> Sorry for that mistake. I merged and tested my patch on our daily updated
> internal repo. So it may have latency with truck and just missed Hao's
> patch at the time I merged my patch. Here is the patch rebased on truck
> now. Please try again. Thanks.****
>
>  ****
>
> 2013/8/21 Ana Pazos <apazos at codeaurora.org>****
>
> Hi Kevin,****
>
>  ****
>
> I am trying your patch now.****
>
> The first thing I noticed is that the patch does not merge cleanly.****
>
> It seems you do not have Hao’s previous clang commit (
> http://llvm.org/viewvc/llvm-project?view=revision&revision=188452 -Clang
> and AArch64 backend patches to support shll/shl and vmovl instructions and
> ACLE function).****
>
> Can you rebase and repost your patch?****
>
>  ****
>
> Thanks,****
>
> Ana.****
>
>  ****
>
>  ****
>
> *From:* cfe-commits-bounces at cs.uiuc.edu [mailto:
> cfe-commits-bounces at cs.uiuc.edu] *On Behalf Of *Kevin Qin
> *Sent:* Monday, August 19, 2013 2:43 AM
> *To:* Joey Gouly
> *Cc:* llvm-commits at cs.uiuc.edu; cfe-commits at cs.uiuc.edu
> *Subject:* Re: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling
> with BHSD suffix****
>
>  ****
>
> Hi Joey,****
>
>  ****
>
>    Thanks a lot for your suggestions. I improved my patch as your good
> advice. I wish to implement some of ACLE intrinsic based on my patch as
> test, but the full implementation of one ACLE intrinsic need to commit on
> both llvm and clang. More important thing is,  all scalar ACLE intrinsic
> are classified to different tables, and these tables are implemented in
> parallel but none of them is accomplished.  There are complex dependence
> among intrinsic and may be rewrite frequently.  The main purpose posting
> this patch at moment is to decrease overlap on such base module and make
> our parallel development more efficient.****
>
>  ****
>
> Best Regards,****
>
> Kevin Qin****
>
>  ****
>
> 2013/8/16 Joey Gouly <joey.gouly at arm.com>****
>
> Hi Kevin,
>
> Clang patches should be sent to cfe-commits. I cc’d them in for you.
>
> You could test this patch by including an ACLE function that you have
> implemented, that way we can see it works. Or if you are going to submit
> those soon, maybe it can go in without a test for now. Let’s see what
> others
> think about that.
>
> Just two minor points.
>
> > +static std::string Insert_BHSD_Suffix(StringRef typestr){
>
> This should just return a 'char'.
>
> > +/// Insert proper 'b' 'h' 's' 'd' behind function name if prefix 'S' is
> used.
> Can you change that to something like
>   Insert proper 'b', 'h', 's', 'd' suffix if 'S' prefix is used.
>
> Thanks
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Kevin Qin
> Sent: 16 August 2013 10:46
> To: llvm-commits at cs.uiuc.edu
> Subject: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling with
> BHSD
> suffix****
>
>
> This patch is used to add ‘bhsd’ suffix in Neon ACLE scalar function name.
> 1. A new type prefix ‘S’ is defined in
> tools/clang/include/clang/Basic/arm_neon.td, which is used to mark whether
> enable ‘bhsd’ suffix mangle.
> 2. If prefix ‘S’ is found before data type, proper suffix will be added
> according below rule:
> Data Type  Suffix
>   i8 -> b
>   i16 -> h
>   i32 f32 -> s
>   i64 f64 -> d
>
> For example,
> If we define a new ACLE function in arm_neon.td like:
>
> def FABD : SInst<”vabd”, “sss”,  “ScSsSiSlSfSd”>
>
> Then, rebuild llvm. We would see below in
> INSTALL_PATH/lib/clang/3.4/include/arm_neon.h
>
> __ai int8_t vabdb_s8(int8_t __a, int8_t __b) {
>   return (int8_t)__builtin_neon_vabdb_s8(__a, __b); }
> __ai int16_t vabdh_s16(int16_t __a, int16_t __b) {
>   return (int16_t)__builtin_neon_vabdh_s8(__a, __b); }
> …
>
> Proper suffix is inserted in both ACLE intrinsics and builtin functions.
>
> Because this patch only works on llvm building time, so I have no idea on
> writing test case for this patch. Fortunately, this patch won’t effect on
> any already defined ACLE functions, for its function depending on the new
> defined prefix ‘S’. So It is safe for current system and is developed to
> help implement new scalar ACLE intrinsics in parallel.
>
> You can see each scalar ACLE function corresponds to an unique builtin
> function. At beginning, I planned to let series of ACLE functions with the
> same semantics reuse one builtin function. But this would introduce extra
> unnecessary convert expression in IR, for different data types have
> different data width. If I promote all data types to i64 and use single
> builtin function, then there will be only an i64 version builtin function
> existing in IR. Though I can add extra arg in builtin function to record
> the
> original data type, but extra data convert IR(sign extension before call
> and
> truncation after call) still exists. This will increase the difficulty in
> coding lower patterns in backend.
>  ****
>
>
>
> ****
>
>  ****
>
> -- ****
>
> Best Regards,****
>
>  ****
>
> Kevin Qin****
>
>
>
> ****
>
>  ****
>
> -- ****
>
> Best Regards,****
>
>  ****
>
> Kevin Qin****
>
>
>
> ****
>
> ** **
>
> -- ****
>
> Best Regards,****
>
> ** **
>
> Kevin Qin****
>



-- 
Best Regards,

Kevin Qin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130823/b0ac4606/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BHSD_mangling_clang_truck5.patch
Type: application/octet-stream
Size: 6312 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130823/b0ac4606/attachment.obj>