[PATCH] implement 3 aarch64 neon instructions (umov smov ins) in llvm

Fri Sep 13 00:44:21 PDT 2013

Hi Kevin,

> I see many developers use llvm-reviews.chandlerc.com for review, so I upload my patch there.

Good idea; I quite like the software once you get used to it. One
thing: it's best to put llvm-commits in the CC list, otherwise most
people will have no idea the patch is there.

A few more comments:

> setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v8i8, Custom);

I'm hoping these won't be necessary any more (in fact they're the main
reason I decided to implement the RegisterOperand change when I did).
Now that VPR64 and VPR128 are more sanely related, you should be able
to write patterns for this instead of the custom lowering.

>  BuildMI(MBB, I, DL, get(AArch64::INSsw), DestReg)
>          .addReg(DestReg)

I still think the separate NEON instructions are a mistake. Last time
you asked for solid evidence, and I obviously couldn't say much. Since
then I've run a set of tight loops exercising the product
{ NEON, Scalar, Mixed } x { FMOV, INS }.

As expected, mixing NEON and scalar instructions made essentially no
difference, but using INS instead of FMOV slowed the program down
(massively at first; once I decided to be kind and break the artificial
register dependency that INS introduces, only slightly).

> multiclass Neon_SMOV_pattern2 <RegisterOperand OpVPR, ValueType OpTy,
>                               Operand OpImm, Instruction SMOVI> {
>   def : Pat<(i64 (sext

There's no need for a multiclass with just one member. You can use a
class instead.
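
As a rough, self-contained sketch of the difference (every name below
is made up for illustration and none of it comes from your patch; the
stand-in classes exist only so the snippet parses on its own with
llvm-tblgen):

    // Hypothetical stand-ins, not real LLVM definitions.
    class ExampleInst<string name> { string Name = name; }
    class ExamplePat<ExampleInst inst, int lanes> {
      ExampleInst Target = inst;
      int Lanes = lanes;
    }
    def EXAMPLE_SMOV : ExampleInst<"smov">;

    // Multiclass with one member: legal, but the extra layer adds nothing.
    multiclass ExamplePatMC<ExampleInst inst, int lanes> {
      def : ExamplePat<inst, lanes>;
    }
    defm : ExamplePatMC<EXAMPLE_SMOV, 8>;

    // The same thing as a plain class, instantiated with an anonymous def.
    class ExamplePatClass<ExampleInst inst, int lanes>
      : ExamplePat<inst, lanes>;
    def : ExamplePatClass<EXAMPLE_SMOV, 16>;

In the real patch the class would presumably derive from Pat, and each
use would become "def :" rather than "defm :".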

Cheers.

Tim.