[PATCH] D69891: [VP,Integer,#1] Vector-predicated integer intrinsics

Simon Moll via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 4 02:20:34 PST 2020


simoll marked 2 inline comments as done.
simoll added a comment.

In D69891#1856146 <https://reviews.llvm.org/D69891#1856146>, @andrew.w.kaylor wrote:

> In D69891#1854113 <https://reviews.llvm.org/D69891#1854113>, @SjoerdMeijer wrote:
>
> > > I'm not entirely clear where this value comes from, but it seems like whatever generated it must know that we're generating SVE code, right?
> >
> > This is not architecture specific, and thus this is not assuming SVE. In the case of SVE, the vector length is unknown at compile time (it is a multiple of something), and constant at run-time. In the RISC-V vector extension, I believe it can be changed at any point. Thus, the way to obtain this value is by reading/writing a status register, or something similar, but relying on querying the architecture features. And whether it is constant at runtime, or can be changed at any point, this API seems to cover the different cases.
>
>
> Alright, I was reaching for a more general concept that I obviously don't know the term for. What I meant to say was just that whatever generated the argument must know something specific about the target architecture.
>
> In D69891#1854301 <https://reviews.llvm.org/D69891#1854301>, @simoll wrote:
>
> > Operand bundles are not the same as optional parameters. Operand bundles pessimistically imply side-effects, whereas a plain `i32` argument is innocuous to optimizations:
> >
> > > https://llvm.org/docs/LangRef.html#operand-bundles
> > > 
> > > - Calls and invokes with operand bundles have unknown read / write effect on the heap on entry and exit (even if the call target is readnone or readonly), unless they’re overridden with callsite specific attributes.
> >
> > That is reasonable for constrained fp intrinsics, which always have such a dependence. Many VP intrinsics however do not have any side effects or only inspect/modify memory through their pointer arguments, etc.
>
>
> We could add a callsite attribute when the intrinsic is created. We have that information about the intrinsic, right?
>
> I'll admit that there may be no practical difference between what you've proposed and what I have in mind (in the abstract sense that assumes it's even possible to implement this in the way that I'm imagining). Mostly, I'm trying to explore what I perceive to be a bad smell in the IR. Having a parameter that is irrelevant for some targets just doesn't seem right. Introducing these intrinsics with "experimental" in the name would make me feel a bit better about that, even though again it has no practical effect.
>
> In D69891#1854257 <https://reviews.llvm.org/D69891#1854257>, @rkruppe wrote:
>
> > The real question is, do targets without native support for EVL have anything to gain from making the argument optional?
>
>
> I'm not sure I agree that is the real question. What is gained by not having the floating point constraint arguments always present even when they aren't needed? We have values that describe the default mode, so we could use those when we want the default mode. I think we all agree that they shouldn't be there when we don't need them, yet I would argue that those arguments are more relevant when "not needed" than the evl argument is.
>
> I'm imagining walking someone through the IR, explaining what each instruction means. I'd be a bit embarrassed when I get to a vp intrinsic and say, "Ignore the last argument. We don't use it for this target."
>
> But like I said above, this is much more a vague indication to me that we haven't got this right yet than a concrete problem.


To me, all of this points in the direction that we'd want a standard mechanism for optional parameters. We have three options for this right now:
a) Come up with a novel, clean-slate scheme for optional parameters with default values (e.g. the default fp environment values for constrained fp, or `i32 -1` for VP intrinsics). Some examples:

  ; all optional parameters set (e.g. RISC-V V or SX-Aurora)
  %x = llvm.fp.add(%a, %b) mask(%mask), evl(%evl), fpenv(fpround.tonearest, fpexcept.strict)

  ; constrained fp SIMD intrinsic (e.g. AVX512)
  %y = llvm.fp.add(%a, %b) mask(%mask), fpenv(fpround.tonearest, fpexcept.strict)

  ; unpredicated, default fp env fp add (isomorphic to the fadd instruction)
  %z = llvm.fp.add(%a, %b)

This is the solution that I'd want to see in LLVM to solve the optional parameter problem once and for all, for constrained fp *and* VP (we should look out for other potential applications and start an RFC).

b) OpBundles (imply side-effects unless overridden with callsite attributes) - clunky, but they could form the starting point for a).
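
For comparison, here is a rough sketch of what b) might look like, assuming a hypothetical two-operand `@llvm.vp.add` variant with the optional parameters carried as operand bundles (the bundle tags and the intrinsic name here are made up for illustration):

  ; mask and evl moved into operand bundles; without extra callsite attributes
  ; this call would conservatively be treated as reading/writing memory
  %x = call <8 x i32> @llvm.vp.add.v8i32(<8 x i32> %a, <8 x i32> %b) [ "mask"(<8 x i1> %mask), "evl"(i32 %evl) ]
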
c) A more exotic solution: add an extra opaque parameter and provide different intrinsics to produce the opaque value (I'm not aware of this being used anywhere right now). This is similar to the idea that @sdesmalen brought up (https://reviews.llvm.org/D69891#inline-632851), but with the produced mask having an opaque type such that it behaves basically like an optional parameter tuple, i.e.:

  declare opaquety @llvm.vp.mask(<8 x i1> %mask)
  declare opaquety @llvm.vp.evlmask(<8 x i1> %mask, i32 %evl)
  declare <8 x float> @llvm.vp.fadd(<8 x float> %a, <8 x float> %b, opaquety %mask)

For constrained fp, this could look as follows:

  declare opaquety @llvm.constrainedfp.fpenv(metadata, metadata)
  declare <8 x float> @llvm.constrained.fadd(<8 x float> %a, <8 x float> %b, opaquety %fpenv)

Note that we could allow chaining of these intrinsics (e.g. add a passthru `opaquety` parameter to the defining intrinsics):

  opaquety %fpenv = llvm.constrainedfp.fpenv(opaquety undef, fpround.tonearest, fpexcept.strict)
  opaquety %opt_params = llvm.vp.evlmask(%fpenv, <8 x i1> %mask, i32 %evl)
  %c = llvm.fadd(%a, %b, %opt_params)


BUT before any of these solutions are adopted, I'd rather we model the optional parameters as standard, explicit parameters with default values (see the sketch below). We can pursue the general solution for the optional parameter problem in parallel.
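
For reference, a sketch of what that explicit form looks like with the intrinsics as proposed here (assuming the usual `v8i32` overload mangling; `i32 -1` stands for "no explicit vector length"):

  declare <8 x i32> @llvm.vp.add.v8i32(<8 x i32>, <8 x i32>, <8 x i1>, i32)

  ; a frontend that only cares about the mask just passes the default evl
  %r = call <8 x i32> @llvm.vp.add.v8i32(<8 x i32> %a, <8 x i32> %b, <8 x i1> %mask, i32 -1)
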
What do you think?



================
Comment at: llvm/docs/LangRef.rst:15081
+is a signed integer value.  The explicit vector length (%evl) is only effective
+if it is non-negative, and when that is the case, its value is in the range:
+
----------------
andrew.w.kaylor wrote:
> Nothing in this description makes it clear that the value should be negative for targets that don't support the explicit vector length argument.
> 
> What should happen if this value isn't negative for a target that doesn't support it?
> Nothing in this description makes it clear that the value should be negative for targets that don't support the explicit vector length argument.
Good point. I'll add that to the description.

> What should happen if this value isn't negative for a target that doesn't support it?
There is a lowering pass that folds `%evl` into the mask, just as for reduction intrinsics.
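
Roughly, for a fixed-width type the expansion can compare a lane-index vector against `%evl` and fold the result into the mask; a sketch of the idea (the value names are illustrative, not from the patch):

  ; splat %evl and build a mask of the lanes with index < %evl
  %evl.ins   = insertelement <8 x i32> undef, i32 %evl, i32 0
  %evl.splat = shufflevector <8 x i32> %evl.ins, <8 x i32> undef, <8 x i32> zeroinitializer
  %idx.mask  = icmp ult <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %evl.splat
  ; fold it into the original mask and pass the default evl
  %new.mask  = and <8 x i1> %idx.mask, %mask
  %r = call <8 x i32> @llvm.vp.add.v8i32(<8 x i32> %a, <8 x i32> %b, <8 x i1> %new.mask, i32 -1)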


================
Comment at: llvm/docs/LangRef.rst:15085
+
+      0 <= %evl <= W,   where W is the vector length.
+
----------------
andrew.w.kaylor wrote:
> What happens if the value isn't in this range? Based on the comments below it seems that the effective mask created should be as if %evl were equal to W. If it is a non-constant value will the generated code somehow enforce that behavior?
We are discussing the case `%evl >= W` in the main RFC right now: https://reviews.llvm.org/D57504#1854330


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69891/new/

https://reviews.llvm.org/D69891




