[llvm-dev] Adding support for vscale

Tue Oct 1 06:42:25 PDT 2019

(readers: please note this, copied from the end of sander's message before writing:
"Given that (2) is a very different use-case, I hope we can keep discussions on
that model separate from this thread, if possible.")

On Tue, Oct 1, 2019 at 12:45 PM Sander De Smalen
<Sander.DeSmalen at arm.com> wrote:

> Thanks @Robin and @Graham for giving some background on scalable vectors and clarifying some of the details!

hi sander, thanks for chipping in.  um, just a point of order: was it
intentional to leave out both jacob and myself?  my understanding is
that inclusive and welcoming language is supposed to be used within this
community, and leaving us out *might* be mistaken as being exclusionary
and unwelcoming.

if that was a misunderstanding or an oversight i apologise for raising it.

> Apologies if I'm repeating things here, but it is probably good to emphasize
> the conceptually different, but complementary models for scalable vectors:
> 1. Vectors of unknown, but constant size throughout the program.

... which matches with both hardware-fixed per-implementation
variations in potential [max] SIMD-width for any given architecture as
well as Vector-based "Maximum Vector Length", typically representing
the "Lanes" of a [traditional] Vector Architecture.

> 2. Vectors of changing size throughout the program.

...representing VL in "Cray-style" Vector Engines (NEC SX-Aurora, RVV,
SV) and representing the (rather unfortunate) corner-case cleanup -
and predication - deployed in SIMD
(https://www.sigarch.org/simd-instructions-considered-harmful/)

> Where (2) basically builds on (1).
>
> LLVM's scalable vectors support (1) directly. The scalable type is defined
> using the concept `vscale` that is constant throughout the program and
> expresses the unknown, but maximum size of a scalable vector.
> My patch builds on that definition by adding `vscale` as a keyword that
> can be used in expressions.

ah HA!  excccellent.  *that* was the sentence giving the key piece of
information needed to understand what is going on, here.  i appreciate
it does actually say that, "This patch adds vscale as a symbolic
constant to the IR, similar to undef and zeroinitializer, so that it
can be used in constant expressions", however without the context about
what vscale is based *on*, it's just not possible to understand.

can i therefore recommend a change, here:

"Scalable vector types are defined as <vscale x #elts x #eltty>,
where vscale itself is defined as a positive symbolic constant
of type integer, representing a platform-dependent (fixed but
implementor-specific) limit of any given hardware's maximum
simultaneous "element processing" capacity"

you could add, in brackets, "(typically the SIMD element width)" at
the end there. then, this starts to make sense, but could be further
made explicit:

"This patch adds vscale as a symbolic constant to the IR, similar to
undef and zeroinitializer, so that vscale - representing the
runtime-detected "element processing" capacity - can be used in
constant expressions"
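
to make that concrete for other readers, here is a minimal c-style
sketch of model (1).  it is not LLVM API: query_hw_vscale() is a
hypothetical stand-in for however the hardware reports its capacity,
the value 4 is made up, and scalable_elts() / vector_ops_needed() are
illustrative helpers.  the only point is that vscale is read once,
never changes afterwards, and multiplies the minimum element count of
the type:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical: stands in for whatever the hardware reports.
   the value 4 is invented purely for illustration. */
size_t query_hw_vscale(void)
{
    return 4;
}

/* number of elements in a <vscale x min_elts x ty> vector:
   unknown at compile time, but constant for the whole program. */
size_t scalable_elts(size_t vscale, size_t min_elts)
{
    assert(vscale > 0);
    return vscale * min_elts;
}

/* how many full-width vector operations are needed to cover n
   elements (rounded up, so the last one may be partially used). */
size_t vector_ops_needed(size_t n, size_t vscale, size_t min_elts)
{
    size_t elts = scalable_elts(vscale, min_elts);
    return (n + elts - 1) / elts;
}
```

so with the made-up vscale of 4, a <vscale x 4 x i32> would hold 16
elements, and that number would be the same everywhere in the program.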

> For this model, predication can be used to disable the lanes
> that are not needed. Given that `vscale` is defined as inherently
> constant and a corner-stone of the scalable type, it makes no
> sense to describe the `vscale` keyword as an intrinsic.

indeed: if it's intended near-exclusively for SIMD-style hardware,
then yes, absolutely.

my only concern would be: some algorithms may perform better with MMX,
some with SSE, and the results may differ again between e.g. AMD and
Intel.  benchmarking could then show that a given algorithm performs
better with vscale=8 (resulting in some other MMX/SSE subset being
utilised) than with vscale=16.

in particular, hardware which doesn't *have* predication is
definitely in trouble if vscale is fixed (SIMD considered harmful).
it may even be the case, for whatever reason, that performance sucks
for AVX512 instructions with a low predicate bitcount, if compared to
using smaller-range SIMD operations, perhaps due to the vastly-greater
size of the AVX instructions themselves.

honestly i don't know: i'm just throwing ideas out, here.

would it be reasonable to assume that predication *always* is to be
used in combination with vscale?  or is it the intention to
[eventually] be able to auto-generate the kinds of [painful in
retrospect] SIMD assembly shown in the above article?
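
to illustrate the two alternatives i am asking about, a rough c-style
sketch (scalar code that only *models* the behaviour, not real
intrinsics).  "lanes" stands in for the fixed element count derived
from vscale, and both function names are made up for this example.
the first version masks off unused lanes in the final chunk, the
second uses a full-width body followed by a scalar cleanup loop:

```c
#include <assert.h>
#include <stddef.h>

/* predicated: every chunk is a full-width operation, and a per-lane
   predicate bit disables the lanes that fall beyond n. */
void add_predicated(int *dst, const int *a, const int *b,
                    size_t n, size_t lanes)
{
    assert(lanes > 0);
    for (size_t i = 0; i < n; i += lanes) {
        for (size_t lane = 0; lane < lanes; lane++) {
            int active = (i + lane) < n;  /* predicate for this lane */
            if (active)
                dst[i + lane] = a[i + lane] + b[i + lane];
        }
    }
}

/* unpredicated: full-width chunks only, then a scalar loop to clean
   up whatever is left over at the end. */
void add_with_cleanup(int *dst, const int *a, const int *b,
                      size_t n, size_t lanes)
{
    size_t i = 0;
    assert(lanes > 0);
    for (; i + lanes <= n; i += lanes)
        for (size_t lane = 0; lane < lanes; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; i++)  /* corner-case cleanup */
        dst[i] = a[i] + b[i];
}
```

both produce the same result; the difference is only in how the tail
is handled when n is not a multiple of lanes.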

> The other model for scalable vectors (2) requires additional intrinsics
> to get/set the `active VL` at runtime.

ok.  with you here.
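
for anyone following along, my understanding of model (2) as a c-style
sketch.  set_vl() here is a hypothetical stand-in for a "set active
VL" intrinsic (it is not a real LLVM or RVV function), and i am
assuming it simply grants min(requested, mvl).  the loop is the usual
strip-mining pattern, with VL changing on the last iteration:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical "set active VL": grants as many elements as were
   requested, capped at the maximum vector length (mvl). */
size_t set_vl(size_t requested, size_t mvl)
{
    return requested < mvl ? requested : mvl;
}

/* multiply n ints by factor, one vector-length at a time.
   returns the number of vector iterations that were needed. */
size_t scale_stripmined(int *data, size_t n, int factor, size_t mvl)
{
    size_t iterations = 0;
    assert(mvl > 0);
    while (n > 0) {
        size_t vl = set_vl(n, mvl);   /* active VL for this pass */
        for (size_t lane = 0; lane < vl; lane++)
            data[lane] *= factor;     /* models one vector op on vl elements */
        data += vl;
        n -= vl;
        iterations++;
    }
    return iterations;
}
```

note there is no predicate mask and no cleanup loop here: the final
pass just runs with a smaller VL.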

> This model would be complementary to `vscale`, as it still requires the
> same scalable vector type to describe a vector of unknown size.

ah.  that's where the assumption breaks down: because SV allows its
vectors to "sit" on top of the *actual* scalar regfile(s), we do in
fact permit an [immediate-specified] vscale to be set, arbitrarily,
at any time.

now, we mmmiiiight be able to get away with assuming that vscale is
equal to the absolute maximum possible setting (64 for RV64, 32 for
RV32), then use / play-with the "runtime active VL get/set"
intrinsics.

i'm kiinda wary of saying "absolutely yes that's the way forward" for
us, particularly without some input from Jacob here.

> `vscale` can be used to express the maximum vector length,

wait... hang on: in RVV i am pretty certain there is not supposed to
be any kind of assumption of knowledge about MVL.  in SV that's fine,
but in RVV i don't believe it is.

bruce, andrew, robin, can you comment here?

> but the `active vector length` would need to be handled through
> explicit intrinsics. As Robin explained, it would also need Simon Moll's
> vector predication proposal to express operations on `active VL` elements.

ok, a link to that would be handy... let me see if i can find it...
what comes up is this: https://reviews.llvm.org/D57504 - is that right?

> > apologies for asking: these are precisely the kinds of
> > from-zero-prior-knowledge questions that help with any review process
> > to clarify things for other users/devs.
> No apologies required, the discussion on scalable types have been going on for quite a while so there are much email threads to read through. It is important these concepts are clear and well understood!

 :)

> > clarifying this in the documentation strings on vscale, perhaps even
> > providing c-style examples, would be extremely useful, and avoid
> > misunderstandings.
> I wonder if we should add a separate document about scalable vectors
> that describe these concepts in more detail with some examples.

it's exceptionally complex, with so many variants, i feel this is
almost essential.

> Given that (2) is a very different use-case, I hope we can keep discussions on
>  that model separate from this thread, if possible.

good idea.  if there's a new thread started please do cc me.  the
cross-relationship between (2) and vscale may make it slightly
unavoidable to involve this thread, though.

l.