[PATCH] Rewrite arm_neon.h generation from scratch

James Molloy James.Molloy at arm.com
Thu Jun 5 04:42:08 PDT 2014


Hi Jiangning,

Thanks for the feedback!

> It's a little bit strange at my first look. Do we have to introduce this? Or can we have an even general solution like NameMangling<Remove, "_high"> to generate whatever names we want? But I would still be OK, if you insist on keeping this operation. At least we don't need over-design.

I think generalising to a (mangled_name "remove", "_high") operation is a good idea.

> For shuffle stuff ... Or they can be essentially an extension to SetTheory instead?

They actually are already implemented as extensions to SetTheory :) That is why they need to be defs, and can't be sub-operations of another def.

Cheers,

James

From: Jiangning Liu [mailto:liujiangning1 at gmail.com]
Sent: 05 June 2014 11:41
To: Jim Grosbach
Cc: James Molloy; Jiangning Liu; Tim Northover; cfe-commits at cs.uiuc.edu
Subject: Re: [PATCH] Rewrite arm_neon.h generation from scratch

Hi James,

Generally I think I like this change, and I see two other benefits at least,

1) The function body of intrinsics defined in arm_neon.h is amazingly consistent everywhere.
2) The solution of solving name conflict brought by nested macros turns to be so simple, and we don't need to care this rat hole any longer. Previously, we came across this issue several times, and it can only be fixed case by case.

For remove_high,

+// remove_high - Return the name of the current intrinsic with the string
+//               "_high" removed. Raises an error if "_high" does not exist
+//               in the current intrinsic name.
+// example: (call (remove_high), $p0) (to call the non-high version of this
+//           intrinsic).
+def remove_high;

It's a little bit strange at my first look. Do we have to introduce this? Or can we have an even general solution like NameMangling<Remove, "_high"> to generate whatever names we want? But I would still be OK, if you insist on keeping this operation. At least we don't need over-design.

For shuffle stuff,

+// rotl - Rotate set left by a number of elements.
+// example: (rotl mask0, 3) -> [3, 4, 5, 6, 0, 1, 2]
+def rotl;
+// rotl - Rotate set right by a number of elements.
+// example: (rotr mask0, 3) -> [4, 5, 6, 0, 1, 2, 3]
+def rotr;
+// highhalf - Take only the high half of the input.
+// example: (highhalf mask0) -> [4, 5, 6, 7] (assuming mask0 had 8 elements)
+def highhalf;
+// highhalf - Take only the low half of the input.
+// example: (lowhalf mask0) -> [0, 1, 2, 3] (assuming mask0 had 8 elements)
+def lowhalf;
+// rev - Perform a variable-width reversal of the elements. The zero'th argument
+//       is a width in bits to reverse. The lanes this maps to is determined
+//       based on the element width of the underlying type.
+// example: (rev 32, mask0) -> [3, 2, 1, 0, 7, 6, 5, 4] (if 8-bit elements)
+// example: (rev 32, mask0) -> [1, 0, 3, 2]             (if 16-bit elements)
+def rev;
+// mask0 - The initial sequence of lanes for shuffle ARG0
+def mask0 : MaskExpand;
+// mask0 - The initial sequence of lanes for shuffle ARG1
+def mask1 : MaskExpand;

can they be all dropped into another independent class or a single operation with parameter, so we can make an even clean Operation classification? Or they can be essentially an extension to SetTheory instead?

Thanks,
-Jiangning


2014-06-05 16:23 GMT+08:00 Jiangning Liu <liujiangning1 at gmail.com>:
Hi James,

For first patch of moving SetTheory location is lack of the change of CMakeLists.txt for TableGen folder, so it would trigger cmake configure failure otherwise.

--- a/lib/TableGen/CMakeLists.txt
+++ b/lib/TableGen/CMakeLists.txt
@@ -6,4 +6,5 @@ add_llvm_library(LLVMTableGen
   TableGenBackend.cpp
   TGLexer.cpp
   TGParser.cpp
+  SetTheory.cpp
   )

Thanks,
-Jiangning


2014-06-05 8:20 GMT+08:00 Jim Grosbach <grosbach at apple.com>:

Woah. This sounds fantastic. I'm at WWDC this week so won't be able to give this an in depth review for a bit. Please don't mistake my silence as lack of interest! If I haven't responded in more depth by early next week, please ping me.

Jim.

> On Jun 4, 2014, at 3:39 AM, James Molloy <james.molloy at arm.com> wrote:
>
> Hi all,
>
> [Obvious stakeholders put on CC, feel free to add more]
>
> NeonEmitter.cpp, the arm_neon.td tablegen backend, really needs improving. I
> wanted
> to change the behaviour slightly for big endian, but quickly realised that
> this may
> well be the hack that broke the camel's back. I tried to incrementally
> refactor it
> but to be honest it's beyond repair. The attached patch reimplements it from
> scratch.
> Luckily the testing coverage on this stuff is absolutely massive, both with
> regression tests and the "emperor" random test case generator.
>
> The main change is that previously, in arm_neon.td a bunch of "Operation"s
> were
> defined with special names. NeonEmitter.cpp knew about these Operations and
> would emit code based on a huge switch. Actually this doesn't make much
> sense -
> the type information was held as strings, so type checking was impossible.
> Also
> TableGen's DAG type actually suits this sort of code generation very well
> (surprising that...)
>
> So now every operation is defined in terms of TableGen DAGs. There are a
> bunch
> of operators to use, including "op" (a generic unary or binary operator),
> "call"
> (to call other intrinsics) and "shuffle" (take a guess...). One of the main
> advantages of this apart from making it more obvious what is going on, is
> that
> we have proper type inference. This has two obvious advantages:
>
>  1) TableGen can error on bad intrinsic definitions easier, instead of just
>     generating wrong code.
>  2) Calls to other intrinsics are typechecked too. So
>     we no longer need to work out whether the thing we call needs to be the
> Q-lane
>     version or the D-lane version - TableGen knows that itself!
>
> Here's an example: before:
>
>  case OpAbdl: {
>    std::string abd = MangleName("vabd", typestr, ClassS) + "(__a, __b)";
>    if (typestr[0] != 'U') {
>      // vabd results are always unsigned and must be zero-extended.
>      std::string utype = "U" + typestr.str();
>      s += "(" + TypeString(proto[0], typestr) + ")";
>      abd = "(" + TypeString('d', utype) + ")" + abd;
>      s += Extend(utype, abd) + ";";
>    } else {
>      s += Extend(typestr, abd) + ";";
>    }
>    break;
>  }
>
> after:
>
>  def OP_ABDL     : Op<(cast "R", (call "vmovl", (cast $p0, "U",
>                                                       (call "vabd", $p0,
> $p1))))>;
>
> As an example of what happens if you do something wrong now, here's what
> happens
> if you make $p0 unsigned before the call to "vabd" - that is, $p0 -> (cast
> "U", $p0):
>
> arm_neon.td:574:1: error: No compatible intrinsic found - looking up
> intrinsic 'vabd(uint8x8_t, int8x8_t)'
> Available overloads:
>  - float64x2_t vabdq_v(float64x2_t, float64x2_t)
>  - float64x1_t vabd_v(float64x1_t, float64x1_t)
>  - float64_t vabdd_f64(float64_t, float64_t)
>  - float32_t vabds_f32(float32_t, float32_t)
> ... snip ...
>
> This makes it seriously easy to work out what you've done wrong in fairly
> nasty
> intrinsics.
>
> As part of this I've massively beefed up the documentation in arm_neon.td
> too.
>
> Things still to do / on the radar:
>  - Testcase generation. This was implemented in the previous version and
> not in
>    the new one, because
>    - Autogenerated tests are not being run. The testcase in test/ differs
> from
>      the autogenerated version.
>    - There were a whole slew of special cases in the testcase generation
> that just
>      felt (and looked) like hacks.
>    If someone really feels strongly about this, I can try and reimplement
> it too.
>  - Big endian. That's coming soon and should be a very small diff on top of
> this one.
>
> Cheers,
>
> James
>
> ---
> include/clang/Basic/arm_neon.td   |  653 +++--
> test/CodeGen/arm64_vcvtfp.c       |    2 +-
> test/Sema/arm-neon-types.c        |    2 +-
> test/Sema/arm64-neon-args.c       |    2 +-
> utils/TableGen/NeonEmitter.cpp    | 4833
> ++++++++++++++-----------------------
> utils/TableGen/TableGenBackends.h |    3 +
> 6 files changed, 2303 insertions(+), 3192 deletions(-)
> <0001-Move-SetTheory-from-utils-TableGen-into-lib-TableGen.patch>
> <0001-Rewrite-ARM-NEON-intrinsic-emission-completely.patch>
_______________________________________________
cfe-commits mailing list
cfe-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits



-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2548782




More information about the cfe-commits mailing list