[PATCH]Clang and AArch64 backend patches to support sshll/ushll instructions

Hao Liu Hao.Liu at arm.com
Mon Aug 19 07:48:39 PDT 2013


Hi Tim,

Actually, this following code is necessary to avoid duplicated name in Macro
expanding: 
	+  case OpMovlHi:
	+    s = TypeString(proto[1], typestr) + " a = __a;\n  " + s;

In arm_neon.h, vshll_n_s8 is a Macro and it already has a variable named
"__a":
	#define vshll_n_s8(a, __b) __extension__ ({ \
  	int8x8_t __a = (a); \
  	(int16x8_t)__builtin_neon_vshll_n_v(__a, __b, 33); })
If vmovl_high is defined as following:
	__ai int16x8_t vmovl_high_s8(int8x16_t __a) {
  	   return (int16x8_t)vshll_n_s8(vget_high_s8(__a), 0); }
There will be an error about the two variable named "__a". One is type
int8x16_t in Function vmovl_high_s8, another is type int8x8_t in Macro
vshll_n_s8. 
As I can't change vshll_n_s8(It must be Macro for the reason of checking the
constant range), I use an local variable "a" in calling Macro vshll_n_s8 to
forbid this conflict as following:
	__ai int16x8_t vmovl_high_s8(int8x16_t __a) {
	   int8x16_t a = __a
  	   return (int16x8_t)vshll_n_s8(vget_high_s8(a), 0); }

However, my patch actually indeed has dead code problems. In arm_neon.h:
	#define vshll_high_n_s8(a, __b) __extension__ ({ \
  	  int8x16_t __a = (a); \
  	  (int16x8_t)vshll_n_s8(vget_high_s8(a), __b); })

The line " int8x16_t __a = (a); " is dead code. But I can't find an easy way
to remove this line, as all the Macros having such local variable __a. So if
we must remove such line, it is better to hack the op name, or we have to
change a lot to achieve this. So I have added the following changes to
remove the dead code:
	-    s += GenMacroLocals(proto, inTypeStr);
	+    if (kind != OpShllHi && kind != OpMovlHi)
	+      s += GenMacroLocals(proto, inTypeStr);

I know hack is not a good way, but actually the old code about variable
naming is not easy to extended. Macro expanding is very troublesome, so
there is no such usage in arm_neon.h, currently they always call inline
functions not Macros. 
However, I don't know whether the hack way is reasonable. If not, maybe we
can just keep the current code and don't change it.


For the long term:
	def SHLL_HIGH_N : SLongInst<"vshlll_high_n", "ndi",
"HcHsHiHUcHUsHUi", "vshll_n">;
is quite different from current code, as currently all Insts are derived
from:
	class Inst <string n, string p, string t, Op o> 
So the 4th parameter must be an Op and no place to add a string. If we are
going to do this, a lot of changes need to be added in NEONEmitter.cpp.

Thanks,
-Hao

-----Original Message-----
From: Tim Northover [mailto:t.p.northover at gmail.com] 
Sent: Monday, August 19, 2013 11:02 AM
To: Hao Liu
Cc: Jiangning Liu; Jiangning Liu; llvm-commits
Subject: Re: [PATCH]Clang and AArch64 backend patches to support sshll/ushll
instructions

Hi Hao,

> I  create a patch to reuse vshll_n and vget_high.
> As a result, there is no built-in
> __builtin_neon_vshll_high_n_v/__builtin_neon_vmovl_high_n_v and no 
> need to handle them in CGBuiltin.cpp.

Thanks very much for taking a look at this. I just meant it as a vague wish
for the future, so seeing a patch was a very nice surprise!

I've just got one issue with the current patch:

+  case OpMovlHi:
+    s = TypeString(proto[1], typestr) + " a = __a;\n  " + s;

What's the purpose of this line? It looks like it just adds dead code to the
definition.

Longer term (and I don't expect you to do this now!), I'm not sure this
approach will scale though. It still seems to require special C++ code to
handle each of the _high_ intrinsics; it just puts that code in NeonEmitter
rather than CGBuiltin.

It might be possible to do this with some kind of modified Inst that just
says "do that other one, but extract the high part of vectors first".

The idea being you could write

def SHLL_HIGH_N : SLongInst<"vshlll_high_n", "ndi", "HcHsHiHUcHUsHUi",
"vshll_n">;

and it would automatically generate the correct delegation. The difficulty
will be doing that without making it too hacky.

Tim.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-AArch64Neon-shiftLeft-shiftLong-movLong-v3.patch
Type: application/octet-stream
Size: 5252 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130819/1fe4fb9b/attachment.obj>


More information about the llvm-commits mailing list