[llvm-dev] Integer -> Floating point -> Integer cast optimizations
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Fri Apr 15 15:23:53 PDT 2016
Patches should generally go to llvm-commits and will need test cases. I
didn't glance at this in detail, but the general approach seems reasonable.
On 04/15/2016 10:57 AM, Carlos Liam via llvm-dev wrote:
> Is this patch sound? https://ghostbin.com/paste/8wt63
> I don't think it is getting triggered.
>
> - CL
>
>> On Apr 15, 2016, at 8:53 AM, Carlos Liam <carlos at aarzee.me> wrote:
>>
>> My understanding is that this checks whether the bit width of the
>> integer *type* fits in the bit width of the mantissa, not the bit
>> width of the integer value.
>>
>> - CL
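
To make the distinction between a type-width check and a value-range check concrete, here is a minimal C sketch. It is not LLVM's implementation; `type_fits` and `value_range_fits` are hypothetical names, and the sketch assumes IEEE 754 single precision (23 stored mantissa bits plus one implicit bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IEEE 754 binary32, i.e. 24 bits of precision. */
enum { FLOAT_PRECISION_BITS = 24 };

/* Type-based check (hypothetical helper): can every value of an integer
 * type that is int_bits wide be represented exactly in a float? */
bool type_fits(unsigned int_bits, bool is_signed) {
    assert(int_bits > 0);
    /* A signed type spends one bit on the sign. */
    unsigned magnitude_bits = is_signed ? int_bits - 1 : int_bits;
    return magnitude_bits <= FLOAT_PRECISION_BITS;
}

/* Value-based check (hypothetical helper): given a proven bound max_abs
 * on |x|, is every such x exactly representable in a float? */
bool value_range_fits(uint64_t max_abs) {
    return max_abs <= ((uint64_t)1 << FLOAT_PRECISION_BITS);
}
```

Under these assumptions a 32-bit signed integer fails the type-based check, while the same integer with a proven bound of 16777216 on its magnitude passes the value-based one.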
>>
>>> On Apr 14, 2016, at 6:02 PM, escha at apple.com wrote:
>>>
>>> We already do this to some extent; see this code in InstCombineCasts:
>>>
>>> // fpto{s/u}i({u/s}itofp(X)) --> X or zext(X) or sext(X) or trunc(X)
>>> // This is safe if the intermediate type has enough bits in its mantissa to
>>> // accurately represent all values of X. For example, this won't work with
>>> // i64 -> float -> i64.
>>> Instruction *InstCombiner::FoldItoFPtoI(Instruction &FI) {
>>>
>>> —escha
>>>
>>>> On Apr 14, 2016, at 2:29 PM, Carlos Liam via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> I mean at the IR level, not the C level. LLVM IR fixes the
>>>> representation of its floating-point types (for example, `float`
>>>> is IEEE 754 single precision), so this has nothing to do with C;
>>>> I only used C as an example.
>>>>
>>>> - CL
>>>>
>>>>> On Apr 14, 2016, at 4:49 PM, Martin J. O'Riordan
>>>>> <martin.oriordan at movidius.com> wrote:
>>>>>
>>>>> I don't think that this is correct.
>>>>>
>>>>> | Let's say we have an int x, and we cast it to a float and back.
>>>>> Floats have 8 exponent bits and 23 mantissa bits.
>>>>>
>>>>> 'float', 'double' and 'long double' do not have specific
>>>>> representations, and a given implementation might choose different
>>>>> FP implementations for each.
>>>>>
>>>>> ISO C and C++ only guarantee that 'long double' can accurately
>>>>> represent all values that may be represented by 'double', and that
>>>>> 'double' can represent accurately all values that may be
>>>>> represented by 'float'; but they do not state that 'float' has 8
>>>>> bits of exponent and 23 bits of mantissa.
>>>>>
>>>>> And this is a particular problem I often face when porting
>>>>> floating-point code between platforms, each of which can genuinely
>>>>> claim to be ISO C compliant.
>>>>>
>>>>> It is "common" for 'float' to be IEEE 754 32-bit Single Precision
>>>>> compliant.
>>>>> It is also "common" for 'double' to be IEEE 754 64-bit Double
>>>>> Precision compliant.
>>>>>
>>>>> But "common" does not mean "standard". The 'clang' optimisations
>>>>> have to adhere to the ISO C/C++ Standards, and not what might be
>>>>> perceived as "the norm". Floating-Point has for a very long time
>>>>> been a problem.
>>>>>
>>>>> o How does the machine resolve FP arithmetic?
>>>>> o How does the compiler perform FP arithmetic - is it the same as
>>>>> the target machine or different?
>>>>> o How does the pre-processor evaluate FP arithmetic - is it the
>>>>> same as the target machine or different?
>>>>>
>>>>> These have been issues since the very first ISO C standard (ANSI
>>>>> C'89/ISO C'90) and before. Very simple things like:
>>>>>
>>>>> #define MY_FP_VAL (3.14159 / 2.0)
>>>>>
>>>>> Where is that divide performed? In the compiler subject to host
>>>>> FP rules? In the compiler subject to target rules? Executed
>>>>> dynamically by the host? The same problem occurs when performing
>>>>> constant folding in the compiler, should it follow a model that is
>>>>> different to what the target would do or not? Worse still, when
>>>>> the pre-processor, compiler, and target are each different machines.
>>>>>
>>>>> These are huge problems in the FP world where exact equivalence
>>>>> and ordering of evaluation really matters (think partial ordering
>>>>> - not the happy unsaturated INT modulo 2^N world).
>>>>>
>>>>> On our architecture, we have chosen the 32-bit IEEE model provided
>>>>> by 'clang' for 'float' and 'double', but we have chosen the 64-bit
>>>>> IEEE model for 'long double'; other implementations are free to
>>>>> choose a different model. We also use IEEE 16-bit FP for 'half'
>>>>> aka '__fp16'. But IEEE also provides for 128-bit FP, 256-bit FP,
>>>>> and there are FP implementations that use 80-bits. In fact,
>>>>> 'clang' does not preclude an implementation choosing IEEE 754
>>>>> 16-bit Half-Precision as its representation for 'float'. This
>>>>> means 5 bits of exponent and 10 bits of mantissa - and that is
>>>>> still ISO C compliant.
>>>>>
>>>>> Any target is free to choose the FP representation it prefers for
>>>>> 'float', and that does not mean that it is bound to IEEE 754
>>>>> 32-bit Single Precision Floating-Point. Any FP optimisations
>>>>> within the compiler need to keep that target clearly in mind; I
>>>>> know, I've been burned by this before.
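
One way to avoid hard-coding the 23-bit assumption in C is to read the format parameters from `<float.h>`. The sketch below uses a hypothetical helper, `max_exact_int_float`. Its result describes only the `float` of the implementation the code is compiled for; an optimisation inside the compiler would instead have to query the format of its target.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Hypothetical helper: largest N such that every integer in [-N, N] is
 * exactly representable in this implementation's float, computed as
 * FLT_RADIX raised to the power FLT_MANT_DIG.
 * Saturates at UINT64_MAX if the bound does not fit in 64 bits. */
uint64_t max_exact_int_float(void) {
    assert(FLT_RADIX >= 2);
    uint64_t bound = 1;
    for (int i = 0; i < FLT_MANT_DIG; ++i) {
        if (bound > UINT64_MAX / FLT_RADIX)
            return UINT64_MAX;       /* would overflow: saturate */
        bound *= FLT_RADIX;
    }
    return bound;
}
```

Where `float` is IEEE 754 single precision (`FLT_RADIX == 2`, `FLT_MANT_DIG == 24`) this yields 16777216; other formats give a different bound, which is the point being made above.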
>>>>>
>>>>> MartinO
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf
>>>>> Of Carlos Liam via llvm-dev
>>>>> Sent: 14 April 2016 19:14
>>>>> To: llvm-dev at lists.llvm.org
>>>>> Subject: [llvm-dev] Integer -> Floating point -> Integer cast
>>>>> optimizations
>>>>>
>>>>> I brought this up in IRC and was told to consult someone who knows
>>>>> more about floating point numbers; I propose an optimization as
>>>>> follows.
>>>>>
>>>>> Let's say we have an int x, and we cast it to a float and back.
>>>>> Floats have 8 exponent bits and 23 mantissa bits.
>>>>>
>>>>> If x matches the condition `countTrailingZeros(abs(x)) >
>>>>> (log2(abs(x)) - 23)`, then we can remove the float casts.
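
The condition above can be sketched as a small C predicate. `fits_in_significand` is a hypothetical name, and `precision` is the number of significand bits including the implicit one (24 for IEEE 754 single precision). The sketch tests `countTrailingZeros >= log2 - (precision - 1)`, which as far as I can tell is the tight form; the strict `>` in the text is also safe but rejects some exactly representable values such as 2**24 + 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the magnitude mag needs at most
 * `precision` significant bits, i.e. the span from its highest set bit
 * down to its lowest set bit is no wider than the significand. */
bool fits_in_significand(uint32_t mag, unsigned precision) {
    assert(precision > 0);
    if (mag == 0)
        return true;                      /* zero is always exact */
    unsigned low = 0;                     /* countTrailingZeros(mag) */
    while (((mag >> low) & 1u) == 0)
        ++low;
    unsigned high = 31;                   /* floor(log2(mag)) */
    while (((mag >> high) & 1u) == 0)
        --high;
    return high - low + 1 <= precision;
}
```

For example, with `precision` set to 24 the predicate accepts 16777216 and 16777218 but rejects 16777217.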
>>>>>
>>>>> So, if we can establish that abs(x) is <= 2**24 (the 23 stored
>>>>> mantissa bits plus the implicit leading bit give 24 bits of
>>>>> precision), we can remove the casts. LLVM does not currently
>>>>> perform that optimization on this C code:
>>>>>
>>>>> #include <stdlib.h> // abs
>>>>>
>>>>> int floatcast(int x) {
>>>>>     // abs(x) is definitely <= 2**24 and fits into our mantissa cleanly
>>>>>     if (abs(x) <= 16777216) {
>>>>>         float flt = (float)x;
>>>>>         return (int)flt;
>>>>>     }
>>>>>     return x;
>>>>> }
>>>>>
>>>>> Things get more interesting when you bring in higher integers and
>>>>> trailing zeros. Floating point can't exactly represent integers
>>>>> that don't fit neatly into the mantissa; they have to round to a
>>>>> multiple of some power of 2. For example, integers between 2**24
>>>>> and 2**25 round to a multiple of 2**1 - meaning that the result
>>>>> has *at least* 1 trailing zero. Integers between 2**25 and 2**26
>>>>> round to a multiple of 2**2 - with the result having at least 2
>>>>> trailing zeros. Et cetera. If we can prove that the input to these
>>>>> casts fits in one of those ranges *and* has at least the
>>>>> correct number of trailing zeros, we can eliminate the casts. LLVM
>>>>> does not currently perform this optimization on this C code:
>>>>>
>>>>> int floatcast(int x) {
>>>>>     // abs(x) is definitely between 2**24 and 2**25
>>>>>     if (16777217 <= abs(x) && abs(x) <= 33554432) {
>>>>>         // what's being cast to float definitely has at least one
>>>>>         // trailing zero in its absolute value; ~1 clears bit 0 and
>>>>>         // keeps the arithmetic in int so a negative x keeps its sign
>>>>>         float flt = (float)(x / abs(x) * (abs(x) & ~1));
>>>>>         return (int)flt;
>>>>>     }
>>>>>     return x;
>>>>> }
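
The rounding behaviour described above can be observed directly with a small C helper. `through_float` is a hypothetical name, and the sketch assumes `float` is IEEE 754 single precision with the default round-to-nearest-even mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts x to float and back.
 * The volatile store forces the value to be rounded to float precision
 * even if the compiler would otherwise keep it in a wider format.
 * The assert keeps |x| small enough that the rounded value always fits
 * back into int32_t. */
int32_t through_float(int32_t x) {
    assert(x > INT32_MIN / 2 && x < INT32_MAX / 2);
    volatile float f = (float)x;
    return (int32_t)f;
}
```

Under those assumptions, values just above 16777216 come back as even integers: 16777217 returns as 16777216, while 16777218 is unchanged.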
>>>>>
>>>>>
>>>>> - CL
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>