[llvm-dev] [RFC] Supporting ARM's SVE in LLVM
Renato Golin via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 28 01:43:48 PST 2016
On 28 November 2016 at 01:43, Paul Walker <Paul.Walker at arm.com> wrote:
> Reconsidering the above loops with this type system leads to IR like:
>
> (1) <n x 4 x i32> += zext <n x 4 x i8> as <n x 4 x i32> ; bigger_type=i32, smaller_type=i8
> (2) <n x 16 x i8> += <n x 16 x i8>
Hi Paul,
I'm with Mehdi on this... these examples don't look problematic. You
have shown what the different constructs would be good at, but I still
can't see where they won't be.
I originally thought that the extended version "<n x m x Ty>" was
required because SVE needs all vector lengths to be a multiple of
128 bits, so they'd be just "glorified" NEON vectors. Without it,
there is no way to make sure the length will be a multiple of 128 bits.
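To make the arithmetic concrete, here is a minimal Python sketch (not LLVM
code). The helper name scalable_vector_bits is hypothetical, and n stands
for the runtime multiplier in "<n x m x Ty>". Assuming n is a positive
integer, both accumulator types from the quoted loops come out as n * 128
bits:

```python
def scalable_vector_bits(n, m, elem_bits):
    """Total size in bits of a hypothetical <n x m x iN> vector.

    n is the runtime multiplier, m the fixed element count,
    elem_bits the width of one element.
    """
    return n * m * elem_bits


# Both example types have a 128-bit fixed part (4 * 32 and 16 * 8),
# so any positive n yields a multiple of 128 bits.
for n in range(1, 17):
    assert scalable_vector_bits(n, 4, 32) == n * 128   # <n x 4 x i32>
    assert scalable_vector_bits(n, 16, 8) == n * 128   # <n x 16 x i8>
```

The range of n used in the loop is only for illustration.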
> (1) %index.next = add nuw nsw i64 %index, mul (i64 vscale, i64 4)
> (2) %index.next = add nuw nsw i64 %index, mul (i64 vscale, i64 16)
>
> The runtime part of the scalable vector lengths remains the same with the second loop processing 4x the number of elements per iteration.
Right, but this is a "constant", and LLVM would be forgiven for asking
the "size" of it. With that proposal, there's no way to know if that's
a <16 x i8> or <16 x i32>.
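As a rough illustration of that ambiguity, the Python sketch below (not
IR; index_step and bytes_per_iteration are made-up helper names, and the
vscale value is an arbitrary assumption) models the index increment from
the quoted loops. The step only encodes an element count, so the same step
is consistent with different byte sizes depending on the element type:

```python
def index_step(vscale, m):
    """Elements advanced per iteration, mirroring mul (i64 vscale, i64 m)."""
    return vscale * m


def bytes_per_iteration(vscale, m, elem_bits):
    """Bytes covered per iteration if each element is elem_bits wide."""
    return index_step(vscale, m) * elem_bits // 8


vscale = 2  # arbitrary runtime value, chosen only for the example

# Loop (2) advances 4x as many elements per iteration as loop (1).
assert index_step(vscale, 16) == 4 * index_step(vscale, 4)

# The same step of 16 * vscale elements says nothing about raw size:
assert bytes_per_iteration(vscale, 16, 8) == 32    # elements are i8
assert bytes_per_iteration(vscale, 16, 32) == 128  # elements are i32
```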
The vectorizer concerns itself mostly with number of elements, not raw
sizes, but these types will survive the whole process, especially if
they come from intrinsics.
> As an aside, note that I am not describing a new style of vectorisation here. SVE is perfectly capable of non-predicated vectorisation with the loop-vectoriser ensuring no data-dependency violations using the same logic as for non-scalable vectors. The exception is that if a strict VF is required to maintain safety we can simply fall back to non-scalable vectors that target Neon. Obviously not ideal but it gets the ball rolling.
Right, got that. Baby steps, safety first.
cheers,
--renato