[PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling with BHSD suffix

Wed Aug 21 22:51:18 PDT 2013

Hi Ana,

Thanks for your feedback. I can reproduce this problem and fix it in
attached patch. My solution is record both intrinsic name and its
prototypes in map. So only a definition with same name and same prototypes
will be treated as redefinition. I think this solution is more reasonable
and fit more scenarios. I have tested this patch by comparing arm_neon.h,
arm_neon.inc before and after this patch. They are all the same. Also I
test defining SInst<"vqadd", "sss", "ScSsSiSlSUcSUsSUiSUl">. There is no
compile error now.

2013/8/22 Ana Pazos <apazos at codeaurora.org>

> Hi Kevin,****
>
> ** **
>
> I tried to use your patch for defining the Scalar Arithmetic and and
> Scalar Reduce Pairwise intrinsics I am working on.****
>
> ** **
>
> I encountered an issue with Scalar Saturating Add which I defined as****
>
> ** **
>
> def SCALAR_QADD   : SInst<"vqadd", "sss", "ScSsSiSlSUcSUsSUiSUl"> ****
>
> ** **
>
> and when enabling tools/clang/lib/Headers/Makefile to build the
> auto-generated headers arm_neon.h., arm_neon_test.h and arm_neon_sema.h.**
> **
>
> ** **
>
> Because I am reusing the intrinsic name “vqadd”, the ARM v7 legacy
> builtins associated with this same intrinsic name ended up not being added
> to AArch64 headers as they should.****
>
> ** **
>
> Below is one possible solution (simply do not add AArch64 Scalar
> intrinsics to the AArch64 Map structure; changed arm_neon.td and
> NeonEmitter.cpp).****
>
> ** **
>
> See if you can reproduce the issue, and if you agree with the change
> below, please add it to your patch.****
>
> ** **
>
> diff --git a/utils/TableGen/NeonEmitter.cpp
> b/utils/TableGen/NeonEmitter.cpp****
>
> index 8924278..fece66a 100644****
>
> --- a/utils/TableGen/NeonEmitter.cpp****
>
> +++ b/utils/TableGen/NeonEmitter.cpp****
>
> @@ -2396,6 +2396,8 @@ void NeonEmitter::runHeader(raw_ostream &OS) {****
>
>    std::vector<Record *> RV = Records.getAllDerivedDefinitions("Inst");***
> *
>
> ** **
>
>    // build a map of AArch64 intriniscs to be used in uniqueness checks.**
> **
>
> +  // The map does not include AArch64 scalar intrinsics because their name
> ****
>
> +  // mangling and non-overloaded builtins prevent name conflict.****
>
>    StringMap<ClassKind> A64IntrinsicMap;****
>
>    for (unsigned i = 0, e = RV.size(); i != e; ++i) {****
>
>      Record *R = RV[i];****
>
> @@ -2404,6 +2406,10 @@ void NeonEmitter::runHeader(raw_ostream &OS) {****
>
>      if (!isA64)****
>
>        continue;****
>
> ** **
>
> +       bool isScalar = R->getValueAsBit("isScalar");****
>
> +    if (isScalar)****
>
> +      continue;****
>
> +****
>
> ** **
>
> diff --git a/include/clang/Basic/arm_neon.td b/include/clang/Basic/
> arm_neon.td****
>
> index 6918f0a..0a98320 100644****
>
> --- a/include/clang/Basic/arm_neon.td****
>
> +++ b/include/clang/Basic/arm_neon.td****
>
> @@ -79,6 +79,7 @@ class Inst <string n, string p, string t, Op o> {****
>
>    bit isShift = 0;****
>
>    bit isVCVT_N = 0;****
>
>    bit isA64 = 0;****
>
> +  bit isScalar = 0;****
>
> ** **
>
>    // Certain intrinsics have different names than their representative***
> *
>
>    // instructions. This field allows us to handle this correctly when we*
> ***
>
> @@ -564,15 +565,46 @@ def SHLL_HIGH_N    : SInst<"vshll_high_n", "ndi",
> "HcHsHiHUcHUsHUi"****
>
> // Converting vectors****
>
> def VMOVL_HIGH   : SInst<"vmovl_high", "nd", "HcHsHiHUcHUsHUi">;****
>
> ** **
>
> +}****
>
> ** **
>
> +let isA64 = 1, isScalar = 1 in {****
>
>
> ////////////////////////////////////////////////////////////////////////////////
> ****
>
> // Scalar Arithmetic****
>
> ** **
>
> ** **
>
> Thanks,****
>
> Ana.****
>
> ** **
>
> ** **
>
> Some more background on the issue…****
>
> ** **
>
> Currently we allow defining AArch64 intrinsics with the same name as ARM
> intrinsics.****
>
> Usually this happens when the AArch64 intrinsic can handle more types than
> the ARM intrinsic. It is seen as a superset of ARM intrinsics.****
>
> So the Prototype string might differ, but they have the same Type string,
> e.g., "ddd", and Name string.****
>
> ** **
>
> Example:****
>
> def VADD    : IOpInst<"vadd", "ddd",
> "csilfUcUsUiUlQcQsQiQlQfQUcQUsQUiQUl", OP_ADD>; --> used by ARM NEON****
>
> ...****
>
> let isA64 = 1 in {****
>
> def ADD : IOpInst<"vadd", "ddd", "csilfUcUsUiUlQcQsQiQlQfQUcQUsQUiQUlQd",
> OP_ADD>; --> with additional Qd type, used by AArch64 NEON****
>
> ...****
>
> }****
>
> ** **
>
> When generating the intrinsics definitions and builtin definitions we
> check for such conflicts using Maps structures.****
>
> We only include ARM NEON legacy intrinsics and builtin definitions into
> AArch64 headers/tests if AArch64 does not redefine them. Otherwise the
> definitions marked as isA64 prevail (they are the superset).****
>
> ** **
>
> Now we have a new situation with scalar intrinsics whose Type strings will
> be different.****
>
> For example, Scalar Saturating Add, which requires a different Types
> string "sss".****
>
> It would be nice to define it as: SInst<"vqadd", "sss",
> "ScSsSiSlSUcSUsSUiSUl">. But it will fail to compile.****
>
> There are a couple of possible solutions.****
>
> This can be address in your patch or in the next one by whoever needs this
> fix.****
>
> ** **
>
> *From:* Kevin Qin [mailto:kevinqindev at gmail.com]
> *Sent:* Tuesday, August 20, 2013 7:02 PM
> *To:* Ana Pazos
> *Cc:* Joey Gouly; Jiangning Liu; Hao Liu; llvm-commits at cs.uiuc.edu;
> cfe-commits at cs.uiuc.edu
>
> *Subject:* Re: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling
> with BHSD suffix****
>
> ** **
>
> Hi Ana,****
>
> ** **
>
> Sorry for that mistake. I merged and tested my patch on our daily updated
> internal repo. So it may have latency with truck and just missed Hao's
> patch at the time I merged my patch. Here is the patch rebased on truck
> now. Please try again. Thanks.****
>
> ** **
>
> 2013/8/21 Ana Pazos <apazos at codeaurora.org>****
>
> Hi Kevin,****
>
>  ****
>
> I am trying your patch now.****
>
> The first thing I noticed is that the patch does not merge cleanly.****
>
> It seems you do not have Hao’s previous clang commit (
> http://llvm.org/viewvc/llvm-project?view=revision&revision=188452 -Clang
> and AArch64 backend patches to support shll/shl and vmovl instructions and
> ACLE function).****
>
> Can you rebase and repost your patch?****
>
>  ****
>
> Thanks,****
>
> Ana.****
>
>  ****
>
>  ****
>
> *From:* cfe-commits-bounces at cs.uiuc.edu [mailto:
> cfe-commits-bounces at cs.uiuc.edu] *On Behalf Of *Kevin Qin
> *Sent:* Monday, August 19, 2013 2:43 AM
> *To:* Joey Gouly
> *Cc:* llvm-commits at cs.uiuc.edu; cfe-commits at cs.uiuc.edu
> *Subject:* Re: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling
> with BHSD suffix****
>
>  ****
>
> Hi Joey,****
>
>  ****
>
>    Thanks a lot for your suggestions. I improved my patch as your good
> advice. I wish to implement some of ACLE intrinsic based on my patch as
> test, but the full implementation of one ACLE intrinsic need to commit on
> both llvm and clang. More important thing is,  all scalar ACLE intrinsic
> are classified to different tables, and these tables are implemented in
> parallel but none of them is accomplished.  There are complex dependence
> among intrinsic and may be rewrite frequently.  The main purpose posting
> this patch at moment is to decrease overlap on such base module and make
> our parallel development more efficient.****
>
>  ****
>
> Best Regards,****
>
> Kevin Qin****
>
>  ****
>
> 2013/8/16 Joey Gouly <joey.gouly at arm.com>****
>
> Hi Kevin,
>
> Clang patches should be sent to cfe-commits. I cc’d them in for you.
>
> You could test this patch by including an ACLE function that you have
> implemented, that way we can see it works. Or if you are going to submit
> those soon, maybe it can go in without a test for now. Let’s see what
> others
> think about that.
>
> Just two minor points.
>
> > +static std::string Insert_BHSD_Suffix(StringRef typestr){
>
> This should just return a 'char'.
>
> > +/// Insert proper 'b' 'h' 's' 'd' behind function name if prefix 'S' is
> used.
> Can you change that to something like
>   Insert proper 'b', 'h', 's', 'd' suffix if 'S' prefix is used.
>
> Thanks
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Kevin Qin
> Sent: 16 August 2013 10:46
> To: llvm-commits at cs.uiuc.edu
> Subject: [PATCH] Aarch64 Neon ACLE scalar instrinsic name mangling with
> BHSD
> suffix****
>
>
> This patch is used to add ‘bhsd’ suffix in Neon ACLE scalar function name.
> 1. A new type prefix ‘S’ is defined in
> tools/clang/include/clang/Basic/arm_neon.td, which is used to mark whether
> enable ‘bhsd’ suffix mangle.
> 2. If prefix ‘S’ is found before data type, proper suffix will be added
> according below rule:
> Data Type  Suffix
>   i8 -> b
>   i16 -> h
>   i32 f32 -> s
>   i64 f64 -> d
>
> For example,
> If we define a new ACLE function in arm_neon.td like:
>
> def FABD : SInst<”vabd”, “sss”,  “ScSsSiSlSfSd”>
>
> Then, rebuild llvm. We would see below in
> INSTALL_PATH/lib/clang/3.4/include/arm_neon.h
>
> __ai int8_t vabdb_s8(int8_t __a, int8_t __b) {
>   return (int8_t)__builtin_neon_vabdb_s8(__a, __b); }
> __ai int16_t vabdh_s16(int16_t __a, int16_t __b) {
>   return (int16_t)__builtin_neon_vabdh_s8(__a, __b); }
> …
>
> Proper suffix is inserted in both ACLE intrinsics and builtin functions.
>
> Because this patch only works on llvm building time, so I have no idea on
> writing test case for this patch. Fortunately, this patch won’t effect on
> any already defined ACLE functions, for its function depending on the new
> defined prefix ‘S’. So It is safe for current system and is developed to
> help implement new scalar ACLE intrinsics in parallel.
>
> You can see each scalar ACLE function corresponds to an unique builtin
> function. At beginning, I planned to let series of ACLE functions with the
> same semantics reuse one builtin function. But this would introduce extra
> unnecessary convert expression in IR, for different data types have
> different data width. If I promote all data types to i64 and use single
> builtin function, then there will be only an i64 version builtin function
> existing in IR. Though I can add extra arg in builtin function to record
> the
> original data type, but extra data convert IR(sign extension before call
> and
> truncation after call) still exists. This will increase the difficulty in
> coding lower patterns in backend.
>
>
> ****
>
>
>
> ****
>
>  ****
>
> -- ****
>
> Best Regards,****
>
>  ****
>
> Kevin Qin****
>
>
>
> ****
>
> ** **
>
> -- ****
>
> Best Regards,****
>
> ** **
>
> Kevin Qin****
>

-- 
Best Regards,

Kevin Qin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130822/7fead3e2/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BHSD_mangling_clang_truck4.patch
Type: application/octet-stream
Size: 6071 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130822/7fead3e2/attachment.obj>