Tue Jun 12 10:21:06 PDT 2018

Just to clarify the part below about changing the proposal, this means that a <scalable n x ty> vector:

* Has a size of vscale * n * sizeof(ty) when reasoning about sizes within LLVM
* Potentially has a smaller size of active_vscale * n * sizeof(ty) depending on the
  current processor state at runtime.

For SVE, active_vscale == vscale. For RVV (and potentially others), it can be smaller based
on the VL state generated by setvl or similar.
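
To make the two sizes concrete, here is a minimal sketch. It assumes the size
counts all n elements per unit of vscale; ScalableVecTy, sizeInBytes and
activeSizeInBytes are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a <scalable n x ty> type; not an LLVM class.
struct ScalableVecTy {
  std::uint64_t MinElts;   // n: element count per unit of vscale
  std::uint64_t EltBytes;  // sizeof(ty)
};

// Size used when reasoning about the type inside the compiler:
// always the full hardware vector length.
std::uint64_t sizeInBytes(ScalableVecTy T, std::uint64_t VScale) {
  return VScale * T.MinElts * T.EltBytes;
}

// Size actually touched at runtime. ActiveVScale is assumed to come from
// processor state (e.g. the VL configuration on RVV) and may not exceed VScale.
std::uint64_t activeSizeInBytes(ScalableVecTy T, std::uint64_t VScale,
                                std::uint64_t ActiveVScale) {
  assert(ActiveVScale <= VScale && "active length cannot exceed hardware length");
  return ActiveVScale * T.MinElts * T.EltBytes;
}
```

With ActiveVScale == VScale (the SVE case) both functions agree; with a smaller
ActiveVScale only the second one shrinks.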


I am not quite so sure about turning the active vector length into
just another mask. It's true that the effects on arithmetic, loads,
stores, etc. are the same as if everything executed under a mask like
<1, 1, ..., 1, 0, 0, ..., 0> with the number of ones equal to the
active vector length. However, actually materializing the masks in the
IR means the RISC-V backend has to reverse-engineer what it must do
with the vl register for any given (masked or unmasked) vector
operation. The stakes for that are rather high, because (1) it applies
to pretty much every single vector operation ever, and (2) when it
fails, the codegen impact is incredibly bad.
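
As a rough illustration of the reverse-engineering concern, the sketch below
builds the <1, ..., 1, 0, ..., 0> mask from an active length and then tries to
recover the length from the mask alone. prefixMask and recoverVL are
hypothetical helpers operating on plain element vectors, not on IR, and a real
backend would face non-constant masks where this kind of matching is much
harder.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Build a mask with VL leading ones followed by zeros.
std::vector<bool> prefixMask(std::size_t NumElts, std::size_t VL) {
  assert(VL <= NumElts && "active length cannot exceed element count");
  std::vector<bool> Mask(NumElts, false);
  for (std::size_t I = 0; I < VL; ++I)
    Mask[I] = true;
  return Mask;
}

// Try to recover the active length from a mask. This only succeeds when the
// mask is a run of ones followed only by zeros; otherwise the caller has to
// fall back to a generic masked lowering.
std::optional<std::size_t> recoverVL(const std::vector<bool> &Mask) {
  std::size_t VL = 0;
  while (VL < Mask.size() && Mask[VL])
    ++VL;
  for (std::size_t I = VL; I < Mask.size(); ++I)
    if (Mask[I])
      return std::nullopt;  // a one after the first zero: not a prefix mask
  return VL;
}
```

The round trip works for a pure prefix mask, but any extra set bit past the
prefix makes recovery fail, which is the slow-path case described above.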

I can see where the concern comes from; we had problems reconstructing
semantics when experimenting with search loop vectorization and often
had to fall back on default (slow) generic cases.

My main reason for proposing this was to try and ensure that the size was
consistent from the point of view of the query functions we were discussing
in the main thread. If you're fine with all size queries assuming maxvl (so
things like stack slots would always use the currently configured maximum
length), then I don't think there's a problem with dropping this part of the
proposal and letting you find a better representation of active length.

