[PATCH][AArch64]RE: patches with initial implementation of Neon scalar instructions

Fri Sep 13 02:02:00 PDT 2013

Hi Kevin,

> From my perspective, DAG should only hold operations with value type, but
> not a certain register class. Which register class to be used is decided by
> compiler after some cost calculation. If we bind v1i32 and v1i64 to FPR,
> then it's hard for compiler to make this optimization.

In an ideal world, I completely agree. Unfortunately the SelectionDAG
infrastructure just doesn't make these choices intelligently. It looks
at each node in isolation and chooses an instruction based on the
types involved. If there were two "(add i64:$Rn, i64:$Rm)" patterns
then only one of them would ever match.
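
As a toy illustration of that last point (this is not LLVM code; the
pattern table, its first-match ordering and the instruction names
ADDxxx/ADDddd are all made up for the sketch), a selector keyed only on
the opcode and operand types can never reach a second pattern with the
same signature, whereas a distinct v1i64 type gives it something to
key on:

```python
# Toy model only: patterns are matched on (opcode, operand types) and
# nothing else. The names ADDxxx / ADDddd are hypothetical.
PATTERNS = [
    (("add", ("i64", "i64")), "ADDxxx"),      # add x, x, x (GPR)
    (("add", ("i64", "i64")), "ADDddd"),      # same signature: never reached
    (("add", ("v1i64", "v1i64")), "ADDddd"),  # add d, d, d (FPR)
]


def select(opcode, operand_types):
    """Return the first instruction whose signature matches the node."""
    key = (opcode, tuple(operand_types))
    for signature, instruction in PATTERNS:
        if signature == key:
            return instruction
    raise LookupError(f"cannot select {key}")


def reachable_for(opcode, operand_types):
    """Indices of patterns that can ever be chosen for this signature."""
    key = (opcode, tuple(operand_types))
    matches = [i for i, (sig, _) in enumerate(PATTERNS) if sig == key]
    return matches[:1]  # only the first match is ever used
```

Here select("add", ["i64", "i64"]) always picks the GPR form, and the
FPR form is only reachable through the v1i64 signature.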

I view this v1iN nonsense as an unfortunate but necessary temporary
measure, until we get our global instruction selection.

I think the only way you could get LLVM to produce both an "add x, x,
x" and an "add d, d, d" from sensible IR without the v1iN types would
be a separate (MachineInstr) pass which goes through afterwards and
patches things up.
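
A very rough sketch of what such a patch-up pass might look like, again
as a toy model rather than anything resembling real MachineInstr code
(the Instr record, the FPR_TWIN table and the "all sources already in
FPR" rule are assumptions made for the example; a real pass would also
have to weigh the cost of cross-bank copies):

```python
from dataclasses import dataclass


@dataclass
class Instr:
    opcode: str   # hypothetical names, e.g. "ADDxxx" or "ADDddd"
    dest: str     # value defined by the instruction
    srcs: tuple   # values used by the instruction


# Hypothetical table of GPR instructions that have an FPR twin.
FPR_TWIN = {"ADDxxx": "ADDddd"}


def patch_up(instrs, fpr_values):
    """Swap a GPR instruction for its FPR twin when every source value
    already lives in an FPR register. Instructions are visited in
    program order, so a rewritten result can enable later rewrites."""
    in_fpr = set(fpr_values)
    out = []
    for ins in instrs:
        twin = FPR_TWIN.get(ins.opcode)
        if twin is not None and all(s in in_fpr for s in ins.srcs):
            ins = Instr(twin, ins.dest, ins.srcs)
            in_fpr.add(ins.dest)  # the result now lives in FPR too
        out.append(ins)
    return out
```

Note that the decision here needs information from neighbouring
instructions, which is exactly what per-node pattern matching lacks.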

The number of actually duplicated instructions is small enough that
this might be practical, but it would have its own ugliness even if it
worked flawlessly (why v1i8, v1i16 but i32 and i64? There's a good
reason, but it's not pretty).

I'm not implacably opposed to the approach, but I think you'd find
implementing it quite a bit of work. Basically, the main thing I want
to avoid is an int_aarch64_sisd_add intrinsic. That seems like it's
the worst of all possible worlds.

Cheers.

Tim.