[llvm-dev] Fixed Point Support in LLVM

Wed Aug 22 02:32:49 PDT 2018

> On Aug 22, 2018, at 4:38 AM, Bevin Hansson <bevin.hansson at ericsson.com> wrote:
> 
> 
> 
> On 2018-08-22 05:56, John McCall via llvm-dev wrote:
>>> On Aug 21, 2018, at 6:20 PM, Leonard Chan <leonardchan at google.com> wrote:
>>> If we were to create a new type down the line, I think the main
>>> features that would distinguish them from other types are the
>>> arbitrary width and scale. Saturation can be handled through
>>> instructions since saturation really only takes effect after an
>>> operation and doesn’t really describe anything about the bits in the
>>> resulting type. Signedness can similarly be managed through operations
>>> and would be consistent with the separation in signed and unsigned int
>>> operations.
>>> 
>>> The unsigned padding is a result of the data bits not taking the whole
>>> width of the underlying llvm integer in the frontend for optimization
>>> purposes, so I don’t think this would be a problem if fixed points
>>> were represented as a native type of arbitrary width and scale
>>> (similar to how llvm represents arbitrary width integers like i33).
>> I agree with all of this, with the caveat that we do have to make a decision
>> about the definedness of the padding bit.  But I think it's okay to just assume
>> that all targets will follow whatever decision we make; if someone blinks and
>> wants the opposite rule, they get to go add an extra argument to all the intrinsics.
> Personally I would prefer to see the bit defined as zero, as that is how our implementation works. But I don't know the precise implications of leaving it undefined other than that overflow on unsigned fixed-point numbers could produce undefined results.

As I understand things, leaving it undefined would allow non-saturating addition/subtraction to just freely overflow into the bit.  I don't know if other arithmetic would equally benefit, and defining it to be zero will probably simplify most other operations even if it does sometimes require some amount of extra masking.

I don't know what the dynamic balance of fixed-point operations is in a real program.  It's certainly plausible that optimizing non-saturating arithmetic is worth penalizing other operations.
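To make the trade-off concrete, here is a minimal sketch, not anyone's actual implementation. It assumes a hypothetical unsigned format with 15 data bits in a 16-bit container, so bit 15 is the padding bit; all function names are made up for illustration. With an undefined padding bit the add is a plain container-width add, but other operations (comparison here) have to mask first; with a zero padding bit it is the other way around.

```python
# Hypothetical layout (an assumption for illustration): 15 data bits
# stored in a 16-bit container, so bit 15 is the padding bit.
CONTAINER_BITS = 16
DATA_BITS = 15
CONTAINER_MASK = (1 << CONTAINER_BITS) - 1
DATA_MASK = (1 << DATA_BITS) - 1


def add_padding_undefined(a, b):
    """Non-saturating add when the padding bit may hold anything.

    A plain container-width add suffices; a carry out of the data bits
    simply lands in the padding bit.
    """
    return (a + b) & CONTAINER_MASK


def add_padding_zero(a, b):
    """Non-saturating add when the padding bit must stay zero.

    Same add, followed by one extra mask that clears the padding bit.
    """
    return (a + b) & DATA_MASK


def less_than_padding_undefined(a, b):
    """Unsigned compare when the padding bit is undefined: mask first."""
    return (a & DATA_MASK) < (b & DATA_MASK)


def less_than_padding_zero(a, b):
    """Unsigned compare when the padding bit is known zero: compare as-is."""
    return a < b
```

Which rule is cheaper overall depends on how often each kind of operation actually occurs, which is exactly the open question above.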

>>> I’m unsure if I should stop what I’m working on now though to
>>> implement this type. Although it seems correct, there also doesn’t
>>> seem to be a very high demand for a new llvm type. I imagine another
>>> reason one would add a new type, in addition to your reasons, is that
>>> it represents a common type that could be used in multiple frontends,
>>> but it doesn’t seem like other llvm frontends actively demand this.
>>> For now I imagine intrinsics would be a nice middle ground and down
>>> the line once fixed point types are fully fleshed out, we can explore
>>> adding a new llvm type.
>> It's fine to start with intrinsics, I think.  I do think it would be better to add a new
>> type, but I don't want to derail your project with a political argument over it.
>> 
>> I *will* derail your project if necessary with an argument that these ought to be
>> portable intrinsics, though. :)
> Can you clarify what you mean by portable? Portable from a target standpoint or from a language standpoint, or both?

I just mean that I want there to be intrinsics called llvm.fixadd or llvm.fixmul or whatever with well-specified, target-independent semantics instead of a million target-specific intrinsics called something like llvm.supermips37.addfixvlq_16.

> Either of these goals could be pretty tricky if the semantics of the intrinsics must be well defined ("fixsmul is equivalent to (trunc (lshr (mul (sext a), (sext b))))") rather than "fixsmul does a signed fixed-point multiplication". If a target or language has different semantics for their fixed-point operations, the intrinsics are useless to them.
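For reference, the quoted expansion can be modeled as follows. This is only a sketch: the quoted formula leaves out the shift amount, so the `scale` parameter, the choice of a double-width intermediate, and the helper names are assumptions made here for illustration.

```python
def sext(value, from_bits):
    """Interpret the low `from_bits` bits of value as a two's-complement integer."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign


def fixsmul(a, b, width, scale):
    """Model of (trunc (lshr (mul (sext a), (sext b)), scale)).

    a and b are raw `width`-bit patterns. The multiply happens in a
    2*width-bit type (an assumption), and the result is a raw
    `width`-bit pattern. No saturation and no rounding adjustment.
    """
    wide_bits = 2 * width
    wide_mask = (1 << wide_bits) - 1
    product = (sext(a, width) * sext(b, width)) & wide_mask  # mul in the wide type
    shifted = product >> scale                               # lshr on the wide bit pattern
    return shifted & ((1 << width) - 1)                      # trunc back to `width` bits
```

With `width=16` and `scale=8`, multiplying 1.5 (raw `0x0180`) by -2.0 (raw `0xFE00`) gives raw `0xFD00`, which is -3.0. A target whose multiply rounds or saturates differently would not match this definition, which is the concern raised in the quoted paragraph.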

> I would rather see them well defined than not, though. I also agree that they should be portable and generic enough to support any language/target implementation, but unless you add lots of intrinsics and parameterization, this could result in a bit of 'mismatch' between what the intrinsics can do and what the frontend wants to do. At some point you might end up having to emit a bit of extra code in the frontend to cover for the deficiencies of the generic implementation.

"Lots of parameterization" sounds about right.  There should just be a pass in the backend that legalizes intrinsics that aren't directly supported by the target.  The analogy is to something like llvm.sadd_with_overflow: a frontend can use that intrinsic on i19 if it wants, and that obviously won't map directly to a single instruction, so LLVM legalizes it to operations that *are* directly supported.  If your target has direct support for saturating signed additions on a specific format, the legalization pass can let those through, but otherwise it should lower them into basic operations.

If the frontend generates a ton of requests for operations that have to be carried out completely in software, that's its problem.
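As a rough illustration of the kind of lowering I mean, here is a sketch of an overflow-checked signed add on an arbitrary width such as i19, and a saturating add expressed in terms of it. The function names are hypothetical and this models only the semantics, not any actual LLVM pass.

```python
def signed_range(bits):
    """Smallest and largest values of a `bits`-wide two's-complement integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def sadd_with_overflow(a, b, bits):
    """Signed add on `bits`-wide integers: (wrapped result, overflow flag).

    a and b are Python ints already within the signed range.
    """
    lo, _ = signed_range(bits)
    exact = a + b
    wrapped = ((exact - lo) % (1 << bits)) + lo  # wrap into [lo, hi]
    return wrapped, wrapped != exact


def sadd_sat(a, b, bits):
    """Saturating signed add built from the overflow-checked add.

    On overflow both operands share a sign, so the sign of `a` picks
    the bound to clamp to.
    """
    lo, hi = signed_range(bits)
    wrapped, overflowed = sadd_with_overflow(a, b, bits)
    if not overflowed:
        return wrapped
    return hi if a >= 0 else lo
```

A target with a native saturating add for the requested format would keep the operation as is; everyone else gets something shaped like the add, compare and select above.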

>> As for other frontends, I can only speak for Swift.  Fixed-point types are not a high
>> priority for Swift, just like they haven't been a high priority for Clang — it's not like
>> Embedded C is a brand-new specification.  But if we had a reason to add them to
>> Swift, I would be pretty upset as a frontend author to discover that LLVM's support
>> was scattered and target-specific and that my best implementation option was to
>> copy a ton of code from Clang.
> It might not be a ton, but at some level you'd have to copy a bit of code. There are several fixed-point operations that probably don't deserve their own intrinsics, like nonsaturating fixed-fixed and fixed-int conversion.

Why not?  I mean, sure, they're easy to define with extends and shifts, but it doesn't really seem *bad* to have intrinsics for them.  I guess you'd lose some small amount of free integer optimization from the middle-end, but the important cases will probably all still get done for free when they get legalized.
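For what it's worth, the extends-and-shifts definitions are about this small. This is a sketch on signed values held as Python ints; the helper names are hypothetical, and rounding fixed-to-int toward zero is an assumption a real specification would have to pin down.

```python
def wrap_signed(value, bits):
    """Truncate to `bits` bits and reinterpret as two's complement."""
    sign = 1 << (bits - 1)
    return ((value + sign) % (1 << bits)) - sign


def fix_to_fix(value, src_scale, dst_scale, dst_bits):
    """Non-saturating rescale between two fixed-point formats.

    Increasing the scale is a left shift; decreasing it is an
    arithmetic right shift, which rounds toward negative infinity.
    """
    if dst_scale >= src_scale:
        shifted = value << (dst_scale - src_scale)
    else:
        shifted = value >> (src_scale - dst_scale)
    return wrap_signed(shifted, dst_bits)


def int_to_fix(n, scale, dst_bits):
    """Non-saturating integer to fixed-point conversion: a left shift."""
    return wrap_signed(n << scale, dst_bits)


def fix_to_int(value, scale):
    """Fixed-point to integer, rounding toward zero (assumed here)."""
    if value >= 0:
        return value >> scale
    return -((-value) >> scale)
```

Every frontend that supports these types would otherwise end up re-deriving the same few lines, including the rounding choice.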

> There's always the possibility of adding them to IRBuilder if we think they might need to be reused.

Please at least do this, yes.

>> The main downsides of not having a type are:
>> 
>>   - Every operation that would've been overloaded by operand type instead has
>>     to be parameterized.  That is, your intrinsics all have to take width and scale in
>>     addition to signed-ness and saturating-ness; that's a lot of parameters, which
>>     tends to make testing and debugging harder.
> The width would be implied by the width of the integer type, and signedness and saturation should simply have their own intrinsics rather than be parameters. Scale would have to be a constant parameter for some of the intrinsics, though.

Well, width can differ from the width of the integer type for these padded types, right?  I mean, you can represent those differently if you want, but I would hope it's represented explicitly and not just as a target difference.

> It means more intrinsics, but I think it's a better design than having a single intrinsic with flags for different cases.

In my experience, using different intrinsics does make it awkward to do a lot of things that would otherwise have parallel structure.  It's particularly annoying when generating code, which also affects middle-end optimizers.  And in the end you're going to be pattern-matching the constant arguments anyway.

But ultimately, I'm not trying to dictate the design to the people actually doing the work.  I just wanted to ward off the implementation that seemed to be evolving, which was a bunch of hand-lowering in Clang's IR-generation, modified by target hooks to emit target-specific intrinsic calls.

>>   - Constants have to be written as decimal integers, which tends to make testing
>>     and debugging harder.
> It would be possible to add a decimal fixed-point format to the possible integer constant representations in textual IR, but this doesn't help when you're printing.

Right.
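To illustrate the readability problem, here is a small sketch of the mapping between a real value and the raw integer that would appear in the IR. The helper names are hypothetical, and the scales used are arbitrary examples.

```python
from fractions import Fraction


def encode(real, scale):
    """Raw integer for `real` in a format with `scale` fractional bits.

    `real` is a decimal string such as "1.5". Values that are not
    exactly representable at this scale are rejected, not rounded.
    """
    scaled = Fraction(real) * (1 << scale)
    if scaled.denominator != 1:
        raise ValueError(f"{real} is not exactly representable at scale {scale}")
    return int(scaled)


def decode(raw, scale):
    """Exact real value of a raw integer with `scale` fractional bits."""
    return Fraction(raw, 1 << scale)
```

The value 1.5 shows up as 192 at scale 7 and as 49152 at scale 15, so someone reading a test has to know the scale and do the division in their head to see what a constant means.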

>>   - Targets that want to pass fixed-point values differently from integers have to
>>     invent some extra way of specifying that a value is fixed-point.
> Another function attribute would probably be fine for that.

It depends.  It wouldn't work well with compound results, at least, but people often tend not to care about those because they're awkward to work with in C.

John.