[llvm-dev] [RFC] Supporting ARM's SVE in LLVM

Mon Nov 28 06:36:49 PST 2016

On 28 November 2016 at 12:02, Paul Walker <Paul.Walker at arm.com> wrote:
> (1)     for (0..N) { int64s[i] += bytes[i]; }  ==> <n x i64> += zext <????? x i8> as <n x i64>
>
> This interpretation falls down at the IR level.  If <n x i8> represents a vector full of bytes, how do you represent a vector that's an 8th full of bytes ready to be zero-extended?

Right, of course! A <n x i8> vector can occupy any number of lanes.

So, for vscale = 4, <4 x 4 x i8> would use 16 lanes (out of a possible
64), while <4 x 16 x i8> would use all 64 lanes. The instructions that
are needed are also different: an extend + copy or just a copy.

All that matters here is the actual number of lanes, which is directly
obtained by (n * m) from <n x m x Ty>. If the number of lanes is
different, and the types can be converted (extend/truncate), then
you'll need additional pre-ops to fudge the data between the moves /
ops.
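
To make the arithmetic concrete, here is a rough Python sketch of how I read
the rule above. It is only an illustration, not anything from the RFC: the
helper names (lane_count, full_lanes, ops_needed) are made up, and the 128
bits per vscale step is an assumption taken from SVE's vector length being a
multiple of 128 bits, which gives the 64 byte lanes at vscale = 4 used in
the example.

```python
# Hypothetical helpers modelling the <n x m x Ty> notation from this thread.
# None of these names exist in LLVM; they only illustrate the lane counting.

BITS_PER_VSCALE = 128  # assumption: one vscale step is a 128-bit SVE granule


def lane_count(n: int, m: int) -> int:
    """Lanes actually used by <n x m x Ty>: simply n * m."""
    return n * m


def full_lanes(vscale: int, elem_bits: int) -> int:
    """Lanes available in a whole register for elements of elem_bits bits."""
    return vscale * BITS_PER_VSCALE // elem_bits


def ops_needed(n: int, m: int, elem_bits: int) -> str:
    """Pick 'copy' when the vector fills the register, else 'extend + copy'.

    This encodes one reading of the message: a partially filled vector has
    to be widened before it can be moved into a full register.
    """
    if lane_count(n, m) == full_lanes(n, elem_bits):
        return "copy"
    return "extend + copy"


if __name__ == "__main__":
    for n, m in [(4, 4), (4, 16)]:
        print(f"<{n} x {m} x i8>: {lane_count(n, m)} of "
              f"{full_lanes(n, 8)} lanes, needs {ops_needed(n, m, 8)}")
```

With vscale = 4 this reproduces the two cases above: <4 x 4 x i8> uses 16
of 64 lanes and needs an extend before the copy, while <4 x 16 x i8> fills
all 64 lanes and only needs the copy.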

I think I'm getting the idea, now. :)

cheers,
--renato