[llvm-dev] RFC: token arguments and operand bundles

Jameson Nash via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 18 10:46:12 PST 2019


> so there is a maintenance overhead to prematurely enable a feature that
it is unclear users will ever be asking for.

I'm imagining it's just a `ConstantInt *fpround =
dyn_cast<ConstantInt>(arg_end()[-2])` whenever you want to inspect it, and
then treating it as dynamic/strict if it's not visible. Are you thinking it's
something more? There's also a maintenance overhead on front-end
maintainers if it is more restrictive than required, so take this as a
request for it to be permissive on behalf of future experimental users :)
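
For concreteness, here's a minimal sketch of the check I have in mind (the
operand position and integer encoding are hypothetical; today these are
metadata strings, not integer arguments):

#include "llvm/IR/Constants.h"
#include "llvm/IR/InstrTypes.h"
using namespace llvm;

// If the (hypothetical) rounding-mode operand is a visible constant, report
// it so existing constant-based folds can run; otherwise the caller treats
// the call as dynamic/strict and simply skips the optimization.
static bool getKnownRoundingMode(const CallBase &CB, uint64_t &RM) {
  Value *Arg = CB.getArgOperand(CB.arg_size() - 2); // second-to-last argument
  if (auto *CI = dyn_cast<ConstantInt>(Arg)) {
    RM = CI->getZExtValue();
    return true;
  }
  return false; // not a constant: assume the conservative semantics
}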

> The exception/rounding params of constrained fp ops in LLVM are, in
contrast, hints to the optimizer to enable certain optimizations in
non-default fp envs.

Yes, but it sounds like they are a specification to the optimizer to help
it understand what the runtime semantics will be (e.g. making the implicit
config register explicit). I'm not sure I see what distinction you're
making, although I don't want to be a spec-lawyer.

>  I don’t want these values to be part of a use-def chain, though perhaps
Jameson does want that.


Partially. I want it to be valid syntax for them to appear in a use-def
chain, even if they don't do anything unless an actual constant gets inserted
there. I want that because it should allow optimizing strictly more
programs than any proposal that can only represent constants
(and because I think it's less complicated).

By making it accept a value, what optimization do you lose? It should still
be able to do all of the optimizations available to Constants now
(by checking whether it does have a constant). But now I’d expect you can also
optimize any program where the value became a constant only because of
previous optimization passes—a benefit you’d just get for free from the
existing passes! I know C99 was conservative here, and can’t explicitly
represent the concept of the “fpenv” or “fpexcept” state being a specific
variable (just a pragma for on and off, and the compiler-specific fp-model
command line options). But as this work has progressed, I'm hoping it'll
start to make the feature more accessible to more people (even if the total
number of people who will ever numerically care about the difference is
tiny).

I agree an operand bundle could be used. I suppose those would be rather
like making it a keyword argument: still conceptually just a regular Use in
the IR, only enumerated in a different list (by tag name instead of by
position)? It isn’t apparent to me why that would be better, unless we want
it to eventually be meaningful on any call in addition to supporting passing
in values?
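
As a sketch of what I mean by "enumerated in a different list", reading the
payload out of a bundle is still just pulling a Use off the call (the
"fpround" tag follows Andy's proposal; treating the payload as a plain Value
with a single input is my assumption):

#include "llvm/IR/InstrTypes.h"
using namespace llvm;

// A bundle operand is looked up by tag name instead of by position, but it
// is still an ordinary Use attached to the call instruction.
static Value *getRoundingOperand(const CallBase &CB) {
  if (auto Bundle = CB.getOperandBundle("fpround"))
    return Bundle->Inputs[0]; // same kind of Use as a trailing argument
  return nullptr;             // bundle absent: the default assumption applies
}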

But Joseph’s description of it makes me think of a general purpose call /
function attribute (defined as giving the fp settings at entry). I can
hypothesize what that attribute could mean when applied to an arbitrary
function or call, though I’m not sure what frontend would find it useful to
emit it. I think that implementation could be capable of maintaining the
same information (and get a constant name shown in the IR printer), so it
seems doable, although perhaps not quite trivial to implement in LLVM
(because the inlining pass would need to be taught all about it, and
there’s a slightly higher risk some pass might decide to drop all metadata
and lose it).
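
To make that concrete, a sketch of how a pass might consume such an
attribute (the attribute name and string values are hypothetical; no such
attribute exists today):

#include "llvm/IR/Function.h"
using namespace llvm;

// Hypothetical string attribute recording the rounding mode in effect at
// entry to the function; if it is absent we must assume "dynamic".
static StringRef getEntryRoundingMode(const Function &F) {
  if (F.hasFnAttribute("fp-rounding-at-entry"))
    return F.getFnAttribute("fp-rounding-at-entry").getValueAsString();
  return "dynamic";
}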

To summarize, since I’ve been making a number of philosophical points here
(each written as a pro for making it a variable Argument vs. requiring it to
be a constant string):
1. Most importantly I think it makes sense because lowering seems to
already have support for it (because it’s equivalent to dynamic and
strict), then in no particular order:
2. I think it seems like it could be a better model for the implementation
in physical hardware: this layer is pretty limited in options, so it has to
be just a register somewhere/somehow. Yeah, it’s often just got a weird
name and mnemonic for ld/st, and it just shows up as an implicit argument
when needed (plus, it doesn’t usually get modeled as such by the register
allocator). But still basically just a regular Argument.
3. I think it's a slightly better model of the IEEE standard (or at least
the two paragraphs I bothered to read): it says the hardware must support
the value being given in a variable. It'd be a somewhat different situation
if the hardware needed to select different instructions to support each
mode (didn't support dynamic mode).
4. I still think it'd be simpler overall, but especially for front-ends: it
doesn’t need a special side-channel in the front-end, or a legalization
pass to know which constants are known to the optimization passes. Custom
pretty-printers are a bit of extra work, true, but that isolates the
complexity in one component in the backend (and apparently might find
itself with other users).
5. It’s more powerful (supports more optimizations): having them as regular
Uses means that, in addition to constant strings typed in the source code
or given on the command line, it can also represent variables that other
optimization passes might turn into constants. (I'm also generally wary of
immArg for this reason, although I recognize that there are—and I’ve
written—cases where dynamic support really isn’t possible or easy and
immArg is required. But in this particular case, the dynamic/strict support
appears to already exist as part of the relevant standards, so I'm mainly
asking why it shouldn't be exposed that way at the IR level too. A sketch
of the kind of refinement this enables follows below.)
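
Here is that sketch, assuming (hypothetically) that the rounding mode were
an ordinary integer operand at a known index and that some analysis had
already proven the fp environment at the call:

#include "llvm/IR/Constants.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Type.h"
using namespace llvm;

// Once analysis has proven the mode, replace a variable rounding operand
// with a constant so the existing constant-based folds apply unchanged.
static void refineRoundingOperand(CallBase &CB, unsigned RMIdx,
                                  uint64_t ProvenRM) {
  if (!isa<ConstantInt>(CB.getArgOperand(RMIdx)))
    CB.setArgOperand(RMIdx, ConstantInt::get(
                                Type::getInt32Ty(CB.getContext()), ProvenRM));
}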

-jameson



On Mon, Nov 18, 2019 at 11:56 AM Simon Moll <Simon.Moll at emea.nec.com> wrote:

> Hi Jameson,
>
> On 11/16/19 5:46 PM, Jameson Nash via llvm-dev wrote:
>
> Hi Simon! Yep, I very much appreciate that you asked for discussion. I
> hope I'm not giving you more than you bargained for! (I likely wouldn't
> have seen this otherwise until the next LLVM release)
>
> Well, if it makes people think about the constrained fp design I'd say
> it's worth it. Constrained fp doesn't exactly get the crowds excited ;-)
>
>
> Yep, I think we're on the same page about the expectations of these
> operands. I'm aware that the user and/or front-end needs to also explicitly
> set the state.
>
> Yes, we use IRBuilder. But that's not a problem. The problem is that it
> assumes that all front-ends want to maintain this information as global
> lexical state until llvm lowering. That's OK for clang, since it doesn't
> currently do optimizations on an intermediate IR. But I'm arguing that it'd
> be easier for other front-ends to pick up this work too if LLVM uses the
> regular call-argument channel for this information. Currently, it seems it
> expects that all front-ends will do some sort of legalization and move this
> information into some sort of a side-channel (as we know that C99 currently
> specifies). That's doable, it'd just be nicer if that was buried inside the
> llvm optimization passes that already know about it.
>
> When a frontend wants to emit fp instructions that may operate in a
> non-default fp env, it can set the IRBuilder flags to "fpexcept.strict" and
> "fpround.dynamic". You can have an analysis that later refines those values
> where possible (e.g. by inspecting calls to "fesetround" or "fesetexceptflag"
> or whatever libcall that particular language uses), including lowering to
> default-env fp ops.
>
>
> >  As such there is little value in allowing variable exception/rounding
> mode params for the constrained fp intrinsics - LLVM passes wouldn't
> understand them (in all but trivial cases) and would have to assume the
> worst case (dynamic/strict) anyway.
>
> We may have to agree to disagree here, but this is exactly what I'm asking
> for LLVM to do. I don't want LLVM to complicate life just because all
> optimizations might not be applicable sometimes. There's lots of
> optimizations that might not be applicable, and I want to avoid coding in
> the exceptions in my frontend if I don't need to.
>
> I'm not familiar enough with IEEE 754 to know if it specifies behaviors
> for the representation in the middle-end. I thought it talked more about
> allowable optimizations and required features than specific representation
> questions. Cherry-picking text, perhaps the closest to my argument would be
> the sentence in 4.2 that the "user can specify that the attribute parameter
> [from a] variable." It's not really saying that you have to be able to pass
> this in as a variable, but I'm going to pretend that I can make that claim
> anyways, haha.
>
> Yes, that's for specifying the fp env..
>
> Section 4.1: "An attribute is logically associated with a program block to
> modify its numerical and exception semantics."
>
> The exception/rounding params of constrained fp ops in LLVM are, in
> contrast, hints to the optimizer to enable certain optimizations in
> non-default fp envs.
>
>
> > uses the information it finds there to change the rounding mode operand
> in the first fadd to rmTowardZero
>
> That sounds neat. I don't think it should conflict with passing in the
> mode as a variable though. Since the langref says the value must agree with
> the current mode, I think it'd still be a legal optimization to replace the
> argument value with a more precise one (either Value->Constant, or
> dynamic->known, or perhaps even replace it with undef if it sees that the
> mode must be wrong). If you don't think that's legal though, that would
> lend credibility to the need to use a custom token representation in an
> operand-bundle.
>
> Also, I thought Simon seemed to be saying this analysis pass wouldn't be
> legal ("there is no going back from dynamic fpexcept/fpround to
> constants"), but I think I must have misunderstood him, since I don't think
> that's what he meant.
>
> Sure, an analysis that understands the fp environment and refines the
> except/rounding parameters is possible (e.g. round.dynamic > round.tonearest
> when all reaching control paths configure the fpenv that way). What I was
> saying is that from the moment we allow variable except/rounding params in
> constrained fp intrinsics - syntactically in the IR - we'd have to support
> that, so there is a maintenance overhead to prematurely enable a feature
> that it is unclear users will ever be asking for.
>
>
> > To me it seems constant tokens do the job just fine without the need for
> a custom code path.
>
> I guess I would have called custom tokens a custom code path too. I don't
> think it's a question about whether we need to represent this custom
> information—we clearly do—just at which places in the pipelines and
> representations should there be customizations to hold the information.
>
> Yep. Still, I think token constants are less invasive since then the
> customization is confined to the 'token' type as a kind of compiler-builtin
> enum type.
>
>
> Best,
> Jameson
>
> Best
>
> - Simon
>
>
> On Fri, Nov 15, 2019 at 3:17 PM Kaylor, Andrew <andrew.kaylor at intel.com>
> wrote:
>
>> We really have been trying to keep in mind that LLVM needs to support
>> multiple front ends, which may be implementing different language
>> standards. As much as possible, I’ve been trying to let the IEEE 754 spec
>> drive my thinking about this, though I’ll admit that on a few points I’ve
>> use the C99 spec as a sort of reference interpretation of IEEE 754.
>>
>>
>>
>> LLVM’s IRBuilder has been recently updated to provide an abstraction
>> layer between front ends and the optimizer. So, if you’re using IRBuilder,
>> you need to call setIsFPConstrained() and then, optionally,
>> IRBuilder::setDefaultConstrainedExcept() and/or
>> setDefaultConstrainedRounding(). After that, calls to something like
>> IRBuilder::CreateFAdd() will automatically create the constrained intrinsic
>> with the appropriate constraints, regardless of how we end up representing
>> them. If your front end isn’t using IRBuilder, I will admit it gets a bit
>> more complicated.
>>
>>
>>
>> I wouldn’t be opposed to a solution that involved a custom printer for
>> these arguments, but I don’t think it really adds anything that we wouldn’t
>> get from using tokens as I have proposed. Likewise with the named constant
>> idea. On the other hand, if I’m misusing tokens then maybe what constants
>> would add is a way to avoid that misuse.
>>
>>
>>
>> Regarding the question of what is exposed to users and how, that’s mostly
>> up to the front end. I would like to clarify how we intend for this to
>> work, in general. Simon touched on this briefly, but I’d like to be a bit
>> more verbose to make sure we’re all on the same page.
>>
>>
>>
>> There are effectively two distinct modes of source code translation to IR
>> with respect to floating point operations -- one where the user is allowed
>> to modify the floating point environment and one where they are not. This
>> may not have been clear to everyone, but by default LLVM IR carries with it
>> the assumption that the runtime rounding mode is “to nearest” and that
>> floating point operations do not have side effects. This was only
>> documented recently, but this is the way the optimizer has always behaved.
>> In this default mode, the IR *shouldn’t* change the floating point
>> environment. I would encourage front ends to document this more
>> specifically, saying that the user is not permitted to change the FP
>> environment.
>>
>>
>>
>> This leads to the necessity of a second state in which the optimizer does
>> not assume the default rounding mode and does not assume that floating
>> point operations have no side effects. Proscribing these assumptions limits
>> optimization, so we want to continue allowing the assumptions by default.
>> The state where the assumptions are not made is accomplished through the
>> use of constrained intrinsics. However, we do not wish to completely
>> eliminate optimizations in all cases, so we want a way to communicate to
>> the optimizer what it can assume. That is the purpose of the fpround and
>> fpexcept arguments. These are not intended to control the rounding mode or
>> exception reporting. They only tell the compiler what it can assume.
>>
>>
>>
>> Understanding this, front ends can control these in any way they see fit.
>> For instance, the front end might have a global setting that changes the
>> rounding mode to “toward zero.” In that case, it would create constrained
>> intrinsics for all FP operations and set the rounding mode argument
>> (however we end up representing it) to rmTowardZero (a constant currently
>> defined by LLVM corresponding to the “fpround.towardzero” metadata
>> argument). Then the optimizer can use this information to perform
>> optimizations like constant folding.
>>
>>
>>
>> Runtime changes to the rounding mode are a separate matter. As I said
>> above, I think front ends should define clear circumstances under which
>> such changes are permitted, but the mechanism for making such changes is
>> independent of the constrained FP intrinsics. For example, consider the
>> following C function.
>>
>>
>>
>> double foo(double A, double B, double C) {
>>   int OrigRM = fegetround();
>>   fesetround(FE_TOWARDZERO);
>>   double tmp = A + B;
>>   fesetround(OrigRM);
>>   return tmp + C;
>> }
>>
>>
>>
>> Assuming the compiler was in a state where it knew fenv access was
>> enabled, I would expect that to get translated to something like this
>> (after SROA cleanup):
>>
>>
>>
>> define double @foo(double %A, double %B, double %C) {
>>   %orig.rm = call i32 @fegetround()
>>   %ignored = call i32 @fesetround(i32 3072)
>>   %tmp = call double @llvm.experimental.constrained.fadd(double %A,
>>       double %B) [ "fpround"(token rmDynamic), "fpexcept"(token ebStrict) ]
>>   %ignored2 = call i32 @fesetround(i32 %orig.rm)
>>   %result = call double @llvm.experimental.constrained.fadd(double %tmp,
>>       double %C) [ "fpround"(token rmDynamic), "fpexcept"(token ebStrict) ]
>>   ret double %result
>> }
>>
>>
>>
>> Notice here the literal constant that C defines is still used for the
>> call to fesetround(FE_TOWARDZERO), and the variable is used for the call
>> that restores the rounding mode. Also notice that in both fadd operations,
>> the rounding mode is declared as rmDynamic. I have an idea that we ought to
>> have a pass that recognizes the fesetround library call and uses the
>> information it finds there to change the rounding mode operand in the first
>> fadd to rmTowardZero, but the front end won’t be expected to do that. We’ll
>> probably want an intrinsic to change the rounding mode so that we don’t
>> need to recognize all manner of language-specific libcalls, but that’s a
>> problem for later.
>>
>>
>>
>> I hope this has been more helpful than tedious. Also, I feel like I
>> should reiterate that I am still seeking all opinions about the use of
>> tokens and operand bundles or any other means of representing the fp
>> constraints. I just want to make sure that we all have the same
>> understanding of what the information I’m trying to represent in IR means.
>>
>>
>>
>> -Andy
>>
>>
>>
>> From: Jameson Nash <vtjnash at gmail.com>
>> Sent: Thursday, November 14, 2019 8:00 PM
>> To: Kaylor, Andrew <andrew.kaylor at intel.com>
>> Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] RFC: token arguments and operand bundles
>>
>>
>>
>> I understand that, but I think you missed my point. Not all front-ends
>> are clang, and non-C frontends are also interested in this work. And even C
>> might eventually want to be able to use these more generally. For
>> example, the current C standard (afaik) doesn’t define what must happen if
>> this pragma tried to use a non-literal constant, such as a template
>> attribute, as the argument. But it’s not obvious to me why LLVM should
>> inherit that limitation. Currently it seems to be implemented in a way that
>> requires special handling in any front-end, influenced strongly by the
>> special handling it’s now getting in C. For other languages, it’s doable to
>> expose this to users regardless, but if you’re already considering changing
>> it, my vote would be to use a normal representation with first-class values.
>>
>>
>>
>> However, I really appreciate the specifics on the concern you brought
>> up, because that’s a good point. If it’s just about better IR
>> printing, perhaps we can just address that directly?
>>
>>
>>
>> Most simply, perhaps these calls could customize the printing to append a
>> comment? Some places already do that, for example to show Function
>> Attributes.
>>
>>
>>
>> Similarly, but more major, LLVM could perhaps define a new “named
>> constant” syntax for the parser format (either with special tokens like
>> your current PR and/or that get defined elsewhere like existing global
>> constants). Certain instructions (such as these) could then use the option
>> to customize the printing of their arguments to use the named constant
>> (after parsing, they’d just be a normal Constant—only printing would
>> optionally use them to show the information better to the reader).
>>
>>
>>
>>
>>
>> On Thu, Nov 14, 2019 at 15:58 Kaylor, Andrew <andrew.kaylor at intel.com>
>> wrote:
>>
>> Let me clarify. These aren’t intended to be exposed to the user. The user
>> code that leads to the generation of these intrinsics will be normal
>> floating point operations combined with either pragmas (such as “STDC
>> FENV_ACCESS ON”) or command line options (such as the recently introduced
>> “-fp-model=strict”).
>>
>> The reason I’ve been avoiding normal constant values is that it provides
>> no information when you’re reading the IR. For example:
>>
>> %sum = call double @llvm.experimental.constrained.fadd(double %x,
>>                                                         double %y, i32 1, i32 2)
>>
>> What does that mean? You’d need to consult an external reference to have
>> any idea.
>>
>>
>>
>> -Andy
>>
>>
>>
>> From: Jameson Nash <vtjnash at gmail.com>
>> Sent: Thursday, November 14, 2019 12:27 PM
>> To: Kaylor, Andrew <andrew.kaylor at intel.com>
>> Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] RFC: token arguments and operand bundles
>>
>>
>>
>> From a front-end perspective, I think it'd be preferable if these either
>> got encoded in the function name or were normal enum value arguments. It's
>> a bit awkward to expose things to the user that must be constant or of a
>> special type or in a special metadata slot, since we now need more special
>> support for it. If the optimization passes couldn't identify a constant
>> value for one of the arguments, these seem like they can fallback to
>> assuming the most conservative semantics (of round.dynamic and
>> fpexcept.strict--e.g. don't optimize) without loss of precision or
>> generality.
>>
>>
>>
>> -Jameson
>>
>>
>>
>> On Thu, Nov 14, 2019 at 2:40 PM Kaylor, Andrew via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hello everyone,
>>
>>
>>
>> I’ve just uploaded a patch (https://reviews.llvm.org/D70261) to
>> introduce a couple of new token types to be used with constrained floating
>> point intrinsics and, optionally, vector predicated intrinsics. These
>> intrinsics may not be of interest to many of you, but I have a more general
>> question.
>>
>>
>>
>> I would like some general feedback on the way I am proposing to use token
>> arguments and operand bundles. I have an incomplete understanding of how
>> these are intended to be used, and I want to make sure what I have in mind
>> is consistent with the philosophy behind them.
>>
>>
>>
>> Currently, the constrained floating point intrinsics require string
>> metadata arguments to describe the rounding mode and exception semantics.
>> These “arguments” are really providing information to the optimizer about
>> what it can and cannot assume when acting on these intrinsics. The rounding
>> mode argument potentially overrides the default optimizer assumption that
>> the “to nearest” rounding mode is in use, and the exception behavior
>> argument overrides the default optimizer assumption that floating point
>> operations have no side effects. I’ve never liked the use of strings here,
>> and the fact that these arguments are not actually inputs to the operation
>> represented by the intrinsic seems vaguely wrong.
>>
>>
>>
>> A typical call to a current intrinsic looks like this:
>>
>>
>>
>> %sum = call double @llvm.experimental.constrained.fadd(double %x,
>>                                                         double %y,
>>                                                         metadata "fpround.dynamic",
>>                                                         metadata "fpexcept.strict")
>>
>>
>>
>> The idea I am pursuing in my patch is to replace these metadata arguments
>> with optional operand bundles, “fpround” and “fpexcept”. If the operand
>> bundles are present, they would mean what the arguments currently mean. If
>> not, the default assumption is allowed. A typical call to a constrained
>> intrinsic would look like this:
>>
>>
>>
>> %sum = call double @llvm.experimental2.constrained.fadd(double %x,
>>                                                          double %y)
>>        [ "fpround"(token rmDynamic), "fpexcept"(token ebStrict) ]
>>
>>
>>
>> Does that seem like a valid use of tokens and operand bundles? Does it
>> seem better than the current approach?
>>
>>
>>
>> Thanks,
>>
>> Andy
>>
>>
>>
>