<div dir="ltr">Hi Ana,<div><br></div><div>I test your patch on trunk, but have 1 failure in regression test. Can you make a check of your patch?</div><div><br></div><div><div>FAIL: Clang :: CodeGen/aarch64-neon-intrinsics.c (2056 of 15433)</div>
******************** TEST 'Clang :: CodeGen/aarch64-neon-intrinsics.c' FAILED ********************
Script:
--
/home/kevin/llvm_trunk/build/bin/./clang -cc1 -internal-isystem /home/kevin/llvm_trunk/build/bin/../lib/clang/3.4/include -triple aarch64-none-linux-gnu -target-feature +neon -ffp-contract=fast -S -O2 -o - /home/kevin/llvm_trunk/llvm/tools/clang/test/CodeGen/aarch64-neon-intrinsics.c | FileCheck /home/kevin/llvm_trunk/llvm/tools/clang/test/CodeGen/aarch64-neon-intrinsics.c
--
Exit Code: 1

Command Output (stderr):
--
clang: /home/kevin/llvm_trunk/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:922: llvm::SDValue llvm::DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR(llvm::SDNode*): Assertion `!(NumElts & 1) && "Legal vector of one illegal element?"' failed.

2013/9/18 Ana Pazos <apazos@codeaurora.org>:

Hi folks,

I have rebased my patches now that the dependent pending patches are merged.

I have also made these additional changes:

1) Adopted the v1ix and v1if solution.
I will revisit it when the "global instruction selection" work is in place.

Tim, can you say more about this upcoming LLVM change?
a) Will it still be SelectionDAG-based?
b) How will having whole-function knowledge help me distinguish when to
create integer vs. scalar Neon operations without adding the v1ix and v1if
types?


2) Introduced a new operator, OP_SCALAR_ALIAS, to allow creating AArch64
scalar intrinsics that are aliases of legacy ARM intrinsics.

Example:

  __ai int64_t vaddd_s64(int64_t __a, int64_t __b) {
    return (int64_t)vadd_s64((int64x1_t)__a, (int64x1_t)__b);
  }

Note that even with this change, the AArch64 intrinsic vaddd_s64 will NOT
generate "add d0, d1, d0" but rather the optimized code "add x0, x1, x0",
because of the casts to int64_t.

I experimented with compiling aarch64-neon-intrinsics.c with -O0 instead
of -O3, but the instruction combining pass still performs this optimization.

So we are really dependent on the compiler optimizations here.

But note that directly calling the legacy ARM intrinsic vadd_s64 does produce
"add d0, d1, d0", since the inputs are of v1i64 type and I have the proper
instruction selection pattern defined.

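To make the contrast concrete, here is a minimal sketch (the wrapper function
names are mine, not from the patch; the expected instructions are illustrative
and assume the register assignments shown above):

  #include <arm_neon.h>

  /* Operands stay in the v1i64 type end to end, so the Neon scalar pattern
     can match, e.g. "add d0, d1, d0". */
  int64x1_t add_in_neon(int64x1_t __a, int64x1_t __b) {
    return vadd_s64(__a, __b);
  }

  /* The casts to int64_t inside vaddd_s64 let the optimizer fold this into
     plain integer code, e.g. "add x0, x1, x0". */
  int64_t add_in_integer(int64_t __a, int64_t __b) {
    return vaddd_s64(__a, __b);
  }
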
3) Got rid of the int_aarch64_sisd_add(u,s)64 and int_aarch64_sisd_sub(u,s)64
intrinsics, as a side effect of implementing (2).

With these intrinsics removed, we cannot guarantee that vaddd_(s,u)64 and
vsubd_(s,u)64 will produce "add/sub d0, d1, d0".
I am allowing these intrinsics to generate integer code, which is the best
implementation of these intrinsics, as Tim pointed out.
I updated the tests accordingly.


4) Used FMOV instead of UMOV to move values between the Neon and integer
units when possible.

For types of size 32 and 64 I tried to make use of FMOV instructions. For
types of size 8 and 16, I make use of the UMOV instructions.

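A rough illustration of the kinds of moves involved (the lane-extract wrappers
below are hypothetical examples, not from the patch; whether FMOV or UMOV is
actually emitted depends on the patch and on the lane):

  #include <arm_neon.h>

  /* 64-bit element: the value can travel to the integer unit with an FMOV,
     e.g. "fmov x0, d0" for lane 0. */
  uint64_t get_lane64(uint64x2_t v) { return vgetq_lane_u64(v, 0); }

  /* 8-bit element: there is no FMOV of this width, so a UMOV is needed,
     e.g. "umov w0, v0.b[1]". */
  uint8_t get_lane8(uint8x16_t v) { return vgetq_lane_u8(v, 1); }
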
Let me know if you have any more comments on these patches.

Thanks,
Ana.

-----Original Message-----
From: Tim Northover [mailto:t.p.northover@gmail.com]
Sent: Friday, September 13, 2013 2:02 AM
To: Kevin Qin
Cc: Ana Pazos; rajav@codeaurora.org; llvm-commits; cfe-commits@cs.uiuc.edu
Subject: Re: [PATCH][AArch64] RE: patches with initial implementation of Neon
scalar instructions

Hi Kevin,

> From my perspective, the DAG should only hold operations with value
> types, not a particular register class. Which register class to use is
> decided by the compiler after some cost calculation. If we bind v1i32 and
> v1i64 to FPR, then it's hard for the compiler to make this optimization.

In an ideal world, I completely agree. Unfortunately the SelectionDAG
infrastructure just doesn't make these choices intelligently. It looks at
each node in isolation and chooses an instruction based on the types
involved. If there were two "(add i64:$Rn, i64:$Rm)" patterns then only one
of them would ever match.

I view this v1iN nonsense as an unfortunate but necessary temporary measure,
until we get our global instruction selection.

I think the only way you could get LLVM to produce both an "add x, x, x" and
an "add d, d, d" from sensible IR without it would be a separate
(MachineInstr) pass which goes through afterwards and patches things up.

The number of actually duplicated instructions is small enough that this
might be practical, but it would have its own ugliness even if it worked
flawlessly (why v1i8, v1i16 but i32 and i64? There's a good reason, but it's
not pretty).

I'm not implacably opposed to the approach, but I think you'd find
implementing it quite a bit of work. Basically, the main thing I want to
avoid is an int_aarch64_sisd_add intrinsic. That seems like it's the worst
of all possible worlds.

Cheers.

Tim.

_______________________________________________
cfe-commits mailing list
cfe-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

--
Best Regards,

Kevin Qin