[llvm-dev] [RFC] Vector Predication
Bruce Hoult via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 31 16:38:52 PST 2019
On Thu, Jan 31, 2019 at 9:03 AM Philip Reames via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Question 1 - Why do we need separate mask and lengths? Can't the length be easily folded into the mask operand?
RISC-V has both masks and an active vector length and the semantics
are different.
TLDR: Masked-off elements in the destination register retain their
previous value, but elements past the active vector length are zeroed.
I'll quote from the current version of the draft spec:
==========
5.4. Active and Tail Element Definitions
The elements within a vector instruction can be divided into four
disjoint subsets.
The prestart elements are those whose element index is less than the
initial value in the vstart register. The prestart elements do not
raise exceptions and do not update the destination vector register.
The active elements during a vector instruction’s execution are the
elements within the current vector length setting and where the
current mask is enabled at that element position. The active elements
can raise exceptions and update the destination vector register group.
The inactive elements are the elements within the current vector
length setting but where the current mask is disabled at that element
position. The inactive elements do not raise exceptions and do not
update the destination vector register.
The tail elements during a vector instruction’s execution are the
elements past the current vector length setting. The tail elements do
not raise exceptions, but do zero the results in the destination
vector register group.
for element index x
prestart(x) = (0 <= x < vstart)
active(x) = (vstart <= x < vl) && (unmasked || mask(x))
inactive(x) = (vstart <= x < vl) && !(unmasked || mask(x))
tail(x) = (vl <= x < VLMAX)
All vector instructions place zeros in the tail elements of the
destination vector register group. Some vector arithmetic instructions
are not maskable, so have no inactive elements, but still zero the
tail elements.
==========
Note: vstart is almost always zero; it exists to support interruptible
vector instructions. "The vstart CSR is writable by unprivileged code,
but non-zero vstart values may cause vector instructions to run
substantially slower on some implementations, so vstart should not be
used by application programmers."
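
To make the difference concrete, here is a minimal Python sketch of the
predicates quoted above, applied to an element-wise add. It models only
the draft semantics as quoted (tail elements zeroed, inactive elements
left alone). The names classify and masked_add are made up for
illustration and are not part of the spec or of LLVM, and the sketch
assumes vstart <= vl <= VLMAX, with VLMAX taken as the length of the
destination list.

```python
def classify(x, vstart, vl, vlmax, mask=None):
    """Return which of the four subsets element index x falls into.

    mask=None models an unmasked instruction. Assumes vstart <= vl <= vlmax.
    """
    if not 0 <= x < vlmax:
        raise IndexError("element index out of range")
    if x < vstart:
        return "prestart"
    if x >= vl:
        return "tail"
    enabled = mask is None or bool(mask[x])
    return "active" if enabled else "inactive"


def masked_add(dest, a, b, vl, mask=None, vstart=0):
    """Hypothetical model of a vector add under the quoted draft rules."""
    vlmax = len(dest)  # assumption: VLMAX is the register length modelled here
    out = list(dest)
    for x in range(vlmax):
        kind = classify(x, vstart, vl, vlmax, mask)
        if kind == "active":
            out[x] = a[x] + b[x]  # active elements are written
        elif kind == "tail":
            out[x] = 0            # tail elements are zeroed
        # prestart and inactive elements keep their previous value
    return out


dest = [9] * 8
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10] * 8
mask = [1, 0, 1, 0, 1, 1, 1, 1]

# Mask plus a separate active vector length of 5.
print(masked_add(dest, a, b, vl=5, mask=mask))
# prints [11, 9, 13, 9, 15, 0, 0, 0]

# The same length folded into the mask instead, with vl = VLMAX.
folded = [int(bool(m) and x < 5) for x, m in enumerate(mask)]
print(masked_add(dest, a, b, vl=8, mask=folded))
# prints [11, 9, 13, 9, 15, 9, 9, 9]
```

The two results differ only in elements 5 to 7: with a separate vl they
are zeroed, while with the length folded into the mask they are merely
masked off and so retain the old destination value.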