[llvm-dev] Proposed new min and max intrinsics
Fabian Giesen via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 12 12:24:09 PST 2018
One more idea: (if this is getting spammy, tell me and I'll stop!)
If %a and %b are ordered equal, return %a | %b (gives -0.0 if either is
-0.0; for maximum, it would be %a & %b, only returning -0.0 if both are
-0.0). The only two distinct bit patterns that compare ordered equal
should be -0 and +0, which are identical everywhere except for the sign bit.
Gets a bit bit-casty but not that bad:
;; Work out result for when %a == %b (ordered)
%a_bits = bitcast float %a to i32
%b_bits = bitcast float %b to i32
%a_bitor_b_bits = or i32 %a_bits, %b_bits
%a_bitor_b = bitcast i32 %a_bitor_b_bits to float
;; Find minimum if not both zero (from above)
%no_nan_and_a_lesser = fcmp olt float %a, %b
%lesser_or_b_if_nan = select i1 %no_nan_and_a_lesser, float %a, float %b
%a_not_nan = fcmp ord float %a, %a
%minimum_not_zeros = select i1 %a_not_nan, float %lesser_or_b_if_nan, float %a
;; Handle +-0
%a_and_b_ordered_equal = fcmp oeq float %a, %b
%final_minimum = select i1 %a_and_b_ordered_equal, float %a_bitor_b, float %minimum_not_zeros
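For anyone who wants to poke at this outside of LLVM, here is a rough C model of the IR above. It is only a sketch, not a proposed lowering: the names bits_of, float_of and minimum_or_trick are made up for illustration, memcpy stands in for the bitcasts, and it assumes 32-bit IEEE-754 floats and a compiler that is not in a fast-math mode.

```c
#include <stdint.h>
#include <string.h>

/* Bit-level views of a float; memcpy plays the role of the IR bitcasts. */
static uint32_t bits_of(float x) { uint32_t u; memcpy(&u, &x, sizeof u); return u; }
static float float_of(uint32_t u) { float x; memcpy(&x, &u, sizeof x); return x; }

/* Hypothetical helper: one statement per value in the IR above. */
static float minimum_or_trick(float a, float b) {
    /* Result for the ordered-equal case: OR of the two bit patterns. */
    float a_bitor_b = float_of(bits_of(a) | bits_of(b));

    /* fcmp olt + select: yields b when the compare is false (incl. NaN). */
    float lesser_or_b_if_nan = (a < b) ? a : b;

    /* fcmp ord a, a + select: if a is NaN, return a. */
    float minimum_not_zeros = (a == a) ? lesser_or_b_if_nan : a;

    /* fcmp oeq + select: only +-0 pairs can differ in bits here. */
    return (a == b) ? a_bitor_b : minimum_not_zeros;
}
```

With this model, minimum_or_trick(0.0f, -0.0f) and minimum_or_trick(-0.0f, 0.0f) both give -0.0, and a NaN in either position comes back as a NaN.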
For x86-64, we can do a bit better by combining the OR with the final
select, which I haven't tried to express in LLVM IR:
; input: %a in xmm0, %b in xmm1
movaps xmm2, xmm0
cmpeqss xmm2, xmm1 ; %a_or_b_ordered_equal
andps xmm2, xmm0 ; %a_or_b_ordered_equal & %a
movaps xmm3, xmm0
minss xmm3, xmm1 ; %lesser_or_b_if_nan
movaps xmm4, xmm0
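cmpunordss xmm0, xmm0 ; %a_is_nan (blendvps takes its source where the xmm0 mask is set)
blendvps xmm3, xmm4 ; %minimum_not_zeros (xmm4 = %a replaces the minss result if %a is NaN)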
; (blendvps turns into andps / andnps / orps pre-SSE4.1)
orps xmm2, xmm3 ; %final_minimum
(if %a_or_b_ordered_equal, then the result of %minimum_not_zeros is
guaranteed to be %b, hence we only need to conditionally OR in %a in
that case.)
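The mask logic of that combine can be modelled in scalar C. This is a sketch of the data flow only, not the SSE sequence itself; minimum_mask_combine and the helper names are hypothetical, and I'm assuming IEEE-754 binary32 floats with no fast-math.

```c
#include <stdint.h>
#include <string.h>

static uint32_t bits_of(float x) { uint32_t u; memcpy(&u, &x, sizeof u); return u; }
static float float_of(uint32_t u) { float x; memcpy(&x, &u, sizeof x); return x; }

/* Hypothetical scalar model of "conditionally OR in %a". */
static float minimum_mask_combine(float a, float b) {
    /* cmpeqss: all-ones if a and b compare ordered equal, else zero. */
    uint32_t eq_mask = (a == b) ? 0xFFFFFFFFu : 0u;

    /* andps: a's bits if equal, zero otherwise. */
    uint32_t a_if_equal = eq_mask & bits_of(a);

    /* minss-style select: gives b when equal or unordered. */
    float lesser_or_b_if_nan = (a < b) ? a : b;

    /* Blend: fall back to a when a itself is NaN. */
    float minimum_not_zeros = (a == a) ? lesser_or_b_if_nan : a;

    /* orps: in the equal case this is a | b, otherwise a no-op. */
    return float_of(a_if_equal | bits_of(minimum_not_zeros));
}
```

When the inputs are equal and nonzero, a and b have identical bits, so the OR changes nothing; the only case where it matters is a +0/-0 pair.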
For the maximum case, you would use "cmpneqss", "orps xmm2, xmm0", then
"andps xmm2, xmm3" for the final combine.
-Fabian
On 11/9/2018 2:22 PM, Fabian Giesen wrote:
> Ah, forgot about the +-0 issue!
>
> Slightly different way to handle them: XOR the two arguments, AND to
> keep only the sign bit, then OR it back into the result at the end (for
> minimum; maximum would do a bit clear/and-not).
>
> The argument goes as follows:
> 1) If the two sign bits agree, this does nothing
> 2) If the two sign bits disagree:
> 2.1) If the comparison was unordered, we already return a NaN; this
> makes us return a NaN with sign bit set. (But still a NaN. I don't see
> anything in the rules for minimum/maximum guaranteeing a specific NaN.)
> 2.2) If the comparison was ordered and not between two 0s, the value
> with the sign bit set is the smaller one, so the OR ends up doing nothing.
> 2.3) If the comparison was ordered and between two 0s, the OR
> guarantees we return a negative 0.
>
> If you write this out with the required bitcasts it probably still gets
> lengthy in LLVM IR form (so might still want to put it in compiler-rt),
> but this seems like a preferable implementation provided the target lets
> you do the required bitwise ops on FP registers.
>
> -Fabian
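For comparison, the XOR variant quoted above can be sketched in C as follows. The names are hypothetical, and this assumes IEEE-754 binary32 floats and no fast-math; it is a model of the idea, not a tested lowering.

```c
#include <stdint.h>
#include <string.h>

static uint32_t bits_of(float x) { uint32_t u; memcpy(&u, &x, sizeof u); return u; }
static float float_of(uint32_t u) { float x; memcpy(&x, &u, sizeof x); return x; }

/* Hypothetical helper: minimum with the sign-bit fix-up from the quoted mail. */
static float minimum_xor_sign(float a, float b) {
    /* XOR the arguments and keep only the sign bit: set iff the signs differ. */
    uint32_t sign_diff = (bits_of(a) ^ bits_of(b)) & 0x80000000u;

    /* NaN-propagating compare + select pair, which ignores the sign of zero. */
    float lesser_or_b_if_nan = (a < b) ? a : b;
    float m = (a == a) ? lesser_or_b_if_nan : a;

    /* OR the sign difference back in; only changes the value for +-0 pairs
       (and may set the sign of a NaN result). */
    return float_of(bits_of(m) | sign_diff);
}
```

For maximum, the last step would clear the bit instead, i.e. bits_of(m) & ~sign_diff.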
>
> On 11/9/18 10:56 AM, Thomas Lively wrote:
>
>> I would agree, but the expansion also has to properly treat negative
>> zero as less than zero, which leads to something like the following:
>>
>> ;; Find minimum if both zero
>> %a_sign = fgetsign %a
>> %b_sign = fgetsign %b
>> %a_is_lesser_zero = icmp ugt %a_sign, %b_sign
>> %minimum_zeros = select %a_is_lesser_zero, %a, %b
>>
>> ;; Find minimum if not both zero (from above)
>> %no_nan_and_a_lesser = fcmp olt %a, %b
>> %lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
>> %a_not_nan = fcmp ord %a, %a
>> %minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a
>>
>> ;; Choose between zeros and not-zeros
>> %a_is_zero = fcmp oeq %a, 0
>> %b_is_zero = fcmp oeq %b, 0
>> %both_zero = and %a_is_zero, %b_is_zero
>> %minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros
>>
>> Which is considerably less reasonable.
>>
>>
>>
>> On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> What is so complicated about these? Shouldn't they just correspond to
>> two compares + selects?
>>
>> To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
>> respectively, correspond exactly to
>>
>> MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
>> MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")
>>
>> IIRC, MINIMUM and MAXIMUM have the added requirement that they should
>> return NaN if _either_ input is NaN, whereas the above will return NaN
>> if the second input (i.e. b) is NaN, but not if the first is.
>>
>> So we need to explicitly catch the case where a is NaN as well. For
>> minimum, that works out to something like:
>>
>> %3 = fcmp olt float %a, %b
>> %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>> %5 = fcmp ord float %a, %a ; true if !isNaN(a)
>> %6 = select i1 %5, float %4, float %a ; if a was NaN, return a
>>
>> for the entire operation. The logic here is that if isNaN(a) ||
>> isNaN(b), the initial comparison will evaluate to false and %4 ends up
>> being b. If isNaN(b), this is a NaN value (as required). The case we
>> are missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN; the
>> second compare + select fixes that one up.
>>
>> The first pair of these corresponds to a single (x86-style) MIN/MAX,
>> and the second turns into a compare followed by (depending on target
>> instruction set) either a BLEND or some logic ops.
>>
>> For minimumNumber/maximumNumber, you should be able to use a similar
>> construction. Showing the example for minimumNumber here:
>>
>> %3 = fcmp olt float %a, %b
>> %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>> %5 = fcmp ord float %b, %b ; true if !isNaN(b)
>> %6 = select i1 %5, float %4, float %a ; if b was NaN, return a
>>
>> Starts out the same as before. Here, the tricky case is !isNaN(a) &&
>> isNaN(b). The initial select %4 will result in the (NaN) b in that
>> case, and the second compare/select pair switches the result to a
>> instead; we will only get a NaN result if both inputs were NaN.
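A side-by-side C sketch of the two quoted expansions, to make the one-operand difference between them easier to see. The function names are hypothetical, and neither version handles the +-0 case; this assumes IEEE-754 floats and no fast-math.

```c
/* minimum: NaN if either input is NaN (second select tests a). */
static float minimum_nan_propagating(float a, float b) {
    float lesser_or_b = (a < b) ? a : b;   /* b if the compare is unordered */
    return (a == a) ? lesser_or_b : a;     /* if a was NaN, return a */
}

/* minimumNumber: NaN only if both inputs are NaN (second select tests b). */
static float minimum_number(float a, float b) {
    float lesser_or_b = (a < b) ? a : b;   /* b if the compare is unordered */
    return (b == b) ? lesser_or_b : a;     /* if b was NaN, return a */
}
```

With a NaN in exactly one position, minimum_nan_propagating returns NaN while minimum_number returns the other operand.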
>>
>> I might be missing something here, but that seems like a fairly
>> harmless expansion, as such things go, and going to compiler-rt feels
>> like overkill.
>>
>>
>> +1