[llvm-dev] Questions about vscale

Tue Apr 7 20:23:07 PDT 2020

Thanks, Hanna.

On Tue, Apr 7, 2020 at 7:51 PM Hanna Kruppe <hanna.kruppe at gmail.com> wrote:

> Hi all,
>
> On Tue, 7 Apr 2020 at 11:04, Renato Golin via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > On Tue, 7 Apr 2020 at 09:30, Kai Wang via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> > >           LMUL = 1           LMUL = 2            LMUL = 4            LMUL = 8
> > > int64_t | vscale x 1 x i64 | vscale x  2 x i64 | vscale x  4 x i64 | vscale x  8 x i64
> > > int32_t | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32 | vscale x 16 x i32
> > > int16_t | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16 | vscale x 32 x i16
> > >  int8_t | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8  | vscale x 64 x i8
> > >
> > > We have another architecture parameter, ELEN, which means the maximum
> > > size of a single vector element in bits.
> >
> > Hi,
> >
> > For my own education, some quick questions:
> >
> > 1. is LMUL always a multiple of ELEN?
>
>
> This happens to be true (at least in the current spec, disregarding
> some in-progress proposals) just because both are powers of two and
> the largest possible LMUL equals the smallest possible ELEN (8), but I
> don't think there is any meaning to be found in this observation. The
> two values govern unrelated aspects of the vector unit.
>
> > 2. Is this fixed on the hardware, depending on the actual lengths, or
> > is this dynamically set by software (on a register or status flag)?
> > 2a. If dynamic, can it change from program to program? Function to
> > function?
>
>
> It's not clear whether by "this" you mean ELEN, LMUL, or something
> else. ELEN is fixed in hardware. LMUL is a property of each individual
> instruction. Most instructions take it from a control register, a few
> encode it in the instruction as an immediate, but in any case it needs
> to be statically determined (on a per-instruction basis) to be able to
> allocate registers. This is not just a constraint for
> compiler-generated code, but also for all hand-written assembly code
> I've seen or can imagine.
>
> >
> > > We hope the type system could be consistent under ELEN = 32 and ELEN =
> > > 64. However, vscale may be a fractional value under ELEN = 32 in the
> > > above type system. When ELEN = 32, i64 is an invalid type (we could
> > > ignore the first row for ELEN = 32) and vscale may become 1/2 at run
> > > time to fit the architecture (if the vector register only has 32 bits).
> >
> > Do you mean ELEN=32 like this?
> > int32_t | vscale x 1 x i32 | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32
> > int16_t | vscale x 2 x i16 | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16
> >  int8_t | vscale x 4 x i8  | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8
> >
> > If the type is invalid, you would need to legalise it, and in that
> > case create some cluttered accessors (via insert/extract element) and
> > possibly use intrinsics to expose underlying instructions that can
> > deal with it.
> >
> > Perhaps I'm not clear on what you need, but vscale is supposed to be
> > the number of valid elements (lanes), and given i64 is invalid, vscale
> > wouldn't apply?
>
>
> I don't know what "vscale wouldn't apply" is supposed to mean. Whether
> it's legal or not, you can write LLVM IR using (for example) the type
> <vscale x 1 x i64> even if the target doesn't natively support it. The
> purpose of legalization is to make sure that results in the behavior
> the type is supposed to have. For <vscale x 1 x i64>, this means among
> other things:
>
> - it has the same number of elements as <vscale x 1 x i32>, but each
> element is twice as big
> - it has half as many elements (each of the same size) as <vscale x 2 x
> i64>
> - its total size in bits is the same as <vscale x 2 x i32>
>
> I think that focusing on the completely illegal i64 might obscure the
> real problem I see with the fractional vscale concept. Let's look at
> <vscale x 1 x i32> instead. The elements are clearly legal in this
> context, even in some vector types, but the <vscale x 1 x i32> type is
> absent from Kai's table. This makes sense: the same vector register
> fits 2x as many i32 elements as i64 elements, so if you start with
> <vscale x 1 x i64> mapping to a single register, then <vscale x 2 x
> i32> is the same size and fits in the same register class, while
> <vscale x 1 x i32> is too small and must be legalized somehow.
>
> But how? If we take Kai's table as gospel and look at a VLEN = ELEN =
> 32 machine, the vector type <vscale x 2 x i32> is supposed to map to a
> single vector register, which is only 32 bits wide, and thus <vscale x 2 x
> i32> would have just one element in this context (matching the "vscale
> = 1/2" intuition). To be consistent with this, <vscale x 1 x i32>
> would have to contain just *half* an element. This is not something
> any legalization strategy can achieve, because it is a fundamentally
> impossible notion. So we end up in a situation where some types are
> not just illegal and have to be legalized, but are contradictory and
> can't be legalized in any meaningful way.
>
> I don't think LLVM can/should support this kind of contradiction. Some
> types have to be legalized, sometimes the legalization is not
> efficient, sometimes it's not even implemented, that's all fine. But
> letting some targets decide that <vscale x 1 x i32> is a fundamentally
> impossible type to even assign a meaning to... that seems
> unprecedented and contrary to the philosophy of LLVM IR as reasonably
> target-independent IR.
>

If we apply the type system pointed out by Renato, is the vector type
<vscale x 1 x i16> legal? If we decide that <vscale x 1 x i16> is a
fundamentally impossible type, is that contrary to the philosophy of LLVM
IR as reasonably target-independent IR? I do not get the point of your
argument.

>
> The obvious solution is to use a different set of legal vector types
> (and thus, a different interpretation of vscale) depending on the
> largest legal element type (ELEN in RISC-V jargon). This corresponds
> to the table for ELEN=32 that Renato gave above. Kai's proposal is
> intended to avoid this, and I can understand the desire for that, but
> it really seems like the lesser evil to me.
>

The problem with defining a different type system depending on the largest
legal element type (ELEN in RISC-V jargon) is that the resulting type
systems are not compatible. I assume that programs compiled for ELEN = 32
can run on ELEN = 64 machines, and that it should be possible to link
ELEN = 32 objects with ELEN = 64 objects. If we use the type
<vscale x 1 x i32> under ELEN = 32, there is no corresponding type under
ELEN = 64 (see my table); it appears to be an illegal type under ELEN = 64.
Does that follow the philosophy of target-independent IR?
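
The mismatch between the two tables can be sketched in a few lines of
Python. This is only an illustration, assuming each table lists
<vscale x (LMUL * base / SEW) x iSEW>, where base is the width of the
largest element in that table (64 for the table at the start of the thread,
32 for the ELEN = 32 table quoted above). The names type_table, fmt and
base_bits are hypothetical and not part of LLVM.

```python
def type_table(elem_bits_list, base_bits, lmuls=(1, 2, 4, 8)):
    """Return the (min_elems, elem_bits) pairs listed by one table.

    base_bits is an assumption of this sketch: the element width whose
    <vscale x 1 x iN> type corresponds to LMUL = 1 in that table.
    """
    table = set()
    for sew in elem_bits_list:
        for lmul in lmuls:
            table.add((lmul * base_bits // sew, sew))
    return table


def fmt(entry):
    """Format a (min_elems, elem_bits) pair as an LLVM scalable vector type."""
    min_elems, bits = entry
    return f"<vscale x {min_elems} x i{bits}>"


# Table from the start of the thread: <vscale x 1 x i64> is LMUL = 1.
elen64_table = type_table([64, 32, 16, 8], base_bits=64)
# Per-ELEN alternative for ELEN = 32: <vscale x 1 x i32> is LMUL = 1.
elen32_table = type_table([32, 16, 8], base_bits=32)

# Types that appear in the ELEN = 32 table but not in the ELEN = 64 one.
only_in_elen32 = sorted(elen32_table - elen64_table,
                        key=lambda e: (-e[1], e[0]))
print([fmt(e) for e in only_in_elen32])
```

Under these assumptions the types that exist only in the ELEN = 32 table
are exactly its LMUL = 1 column, which includes <vscale x 1 x i32>.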

I hope we can design a unified type system for different ELEN. However,
vscale may be fractional at run time under some circumstances (VLEN =
32, ELEN = 32) in my proposal. That is why I would like to know whether a
fractional vscale matters or not.
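
The arithmetic behind the fractional case can be sketched as follows. This
is only an illustration under one assumption: vscale is taken to be
VLEN / 64, so that <vscale x 1 x i64> fills exactly one vector register as
in the table at the start of the thread. The function names are made up for
the example and are not LLVM APIs.

```python
from fractions import Fraction


def vscale_for(vlen_bits, lmul1_bits=64):
    """Assumed vscale: <vscale x 1 x i64> occupies one VLEN-bit register."""
    return Fraction(vlen_bits, lmul1_bits)


def num_elements(min_elems, vscale):
    """Element count of <vscale x min_elems x T>."""
    return min_elems * vscale


def total_bits(min_elems, elem_bits, vscale):
    """Total size in bits of <vscale x min_elems x i(elem_bits)>."""
    return min_elems * elem_bits * vscale


rows = []
for vlen in (32, 64, 128):
    vs = vscale_for(vlen)
    # The size relationship between types holds for any vscale,
    # fractional or not: <vscale x 1 x i64> and <vscale x 2 x i32>.
    assert total_bits(1, 64, vs) == total_bits(2, 32, vs)
    # Element counts of <vscale x 2 x i32> and <vscale x 1 x i32>.
    rows.append((vlen, vs, num_elements(2, vs), num_elements(1, vs)))

for row in rows:
    print(*row)
```

With VLEN = 32 this gives vscale = 1/2: <vscale x 2 x i32> has one element,
while <vscale x 1 x i32> would have half an element. For VLEN = 64 and
VLEN = 128 every count is a whole number.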

Thanks,
Kai

> Best regards
> Hanna
>
>
> > > Is there any problem with assuming vscale to be fractional under some
> > > circumstances? vscale should be an unknown value when compiling, so it
> > > should have no impact on code generation and optimization. The
> > > relationship between types is correct regardless of vscale's value. Is
> > > there anything I missed?
> >
> > I believe the assumption was always that vscale is an integer.
> > Representing it as a fraction would need code changes for sure, and
> > would also mean reevaluating the assumptions.
> >
> > I'm copying some SVE and LV people to give a more informed opinion.
> >
> > cheers,
> > --renato
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>