[PATCH] [Multiprecision Arithmetic Optimization] Teach SelectionDAG how to promote the integer carry result of {add, sub}{c, e} to a larger integer type.
Michael Gottesman
mgottesman at apple.com
Wed May 22 14:29:22 PDT 2013
On May 22, 2013, at 12:33 AM, Duncan Sands <duncan.sands at gmail.com> wrote:
> Hi Michael,
>
> On 21/05/13 23:53, Michael Gottesman wrote:
>> Hey Duncan. I am going to respond in a backwards order since I think it will
>> yield additional clarity [if it does not, please tell me and I will elaborate = )]:
>>
>>> Actually, isn't ADDC really exactly the same as UADDO? If so, how about
>>> eliminating the ADDC in favour of UADDO? In any case, there's already
>>> promotion logic for UADDO, DAGTypeLegalizer::PromoteIntRes_UADDSUBO, so
>>> maybe you could take a look.
>
> actually I should probably have said: eliminate UADDO, replacing it with ADDC
> everywhere.
>
This seems out of scope for this patch since this patch is a small feature addition patch. Changing UADDO -> ADDC everywhere will definitely require changing a bunch of code in the X86 backend at least if not more backends. If we want to do this, we should file a PR (just due to the magnitude of the work).
>>
>> While ADDC/UADDO seem similar they are meant to model two different things.
>>
>> UADDO is meant to model the situation where one performs an addition and wishes
>> to detect overflow as a boolean value and perform some non-arithmetic operation,
>
> How do you know it will be used for non-arithmetic operations?
I was basing my assertion on the prior art in the X86 backend.
>
>> not perform additional arithmetic with said overflow value. Thus it makes sense
>> when we are promoting UADDO into an add + setne operation since UADDO is lowered
>> into similar code anyways (see LowerXALUO X86ISelLowering.cpp) meaning no
>> performance hit.
>>
>> On the other hand ADDC is meant to model the situation where one wishes to
>> perform additional arithmetic with the overflow value (specifically in a
>> multiprecision arithmetic context), i.e. on intel:
>>
>> addq (%rbx), %rax
>> adcq 8(%rbx), %rcx
>> adcq 16(%rbx), %ddx
>> ...
>>
>> If we were to lower ADDC like UADDO, we would instead have something along the
>> lines of the following non-optimal codegen:
>>
>> xorq %rax, %rax
>> addq (%rbx), %rcx
>> setne %al
>> movq %rax, %r8
>> addq 8(%rbx), %r9
>> getne %al
>> addq %r9, %r8
>> ...
>>
>> Thus the necessity of this distinction. X86 definitely follows said distinction
>> if you look at the lowering.
>
> I disagree. The ADDC and UADDO SDAG nodes have exactly the same definition.
> IMO, the fact that UADDO codegen is poor is a codegen weakness that should be
> improved, not a reason to keep two nodes, unless there really is no other way.
Again, I feel that changing UADDO -> ADDC to be out of scope for this patch since the patch is really only about handling the lowering of ADDC, not changing a bunch of code to change all usages of UADDO -> ADDC everywhere.
>
>>
>> Which leads me to your first paragraph.
>>
>>> thanks for adding the ResNo == 0 case, but I don't think it is correct.
>>> That's because you continue to return an ADDC code, yet because you zero
>>> extended the add operands there is no reason why the extended operands
>>> should overflow (set the carry flag) exactly when the original operation
>>> would.
>>
>> This is not completely true. It depends on how you do it. Specifically it
>> depends on how one legalizes the source operations of the addc.
>
> No it doesn't, this is not how type legalization works. Upper bits of promoted
> integers contain *undefined bits*. If they happen to be zext'd then you just
> got lucky, you can't rely on it. That's why if you *require* the extra bits to
> be zero-extended you need to use ZExtPromotedInteger, which explicitly zeros out
> the extra bits (the optimizers are often able to get rid of the explicit zeroing
> out if instead they can use an incoming node for which they can prove that the
> bits are zero already).
>
> Anyway, this is something of a digression, since promotion *doesn't change the
> meaning of what you are calculating*. Consider the following operation:
>
> (i8 x, i1 flag) = addc i8 255, 2
>
> Then you get:
>
> x = i8 1
> flag = i1 1
>
> On a platform where i8 and i1 have to be promoted to 32 bits, this *must* get
> legalized to:
>
> x = a 32 bit integer, where the 8 lower bits are 00000001 (aka i8 1), and the
> rest can be anything (it doesn't matter what they are, you can choose something
> convenient)
>
> flag = a 32 bit integer, where the lowest bit is 1, and the rest can be
> anything. (If promoting according to the so-called "boolean contents" then
> the extra bits may have additional requirements, such as being zero- or sign-
> extended from the lowest bit).
>
> Now consider what your logic does, supposing first you have to promote ResNo 0.
> Your logic forms
>
> (i32 x', i1 flag') = addc i32 255, 2
>
> So you get:
>
> x' = i32 257
> flag' = i1 0 (as this i32 operation does not set the carry flag)
>
> It is OK to use x' for the promoted value of x, since the 8 lowest bits are
> 00000001. But it is not OK to use flag' for flag, since the lowest bit is
> zero, not 1.
>
> In short, the problem is correctly calculating the flag (you have to check if
> bit 8 of x' was set), not calculating the sum. See the UADDO promotion logic
> for one way that calculating the flag can be done.
Ok. My point was that if we had certain guarantees than it *could* work. On the other hand as you have pointed out in this section of your response we do not have any of those guarantees. The end result is my digression was a waste of text and time = /.
Would zero extending just ResNo 0 be a satisfactory lowering (i.e. no inputs, just ResNo 0)?
Thanks for the brain dumping = ).
Michael
>
> For instance if
>> one is loading from an 8 bit array, zext to a 32 bit operation and then perform
>> the addc you would have the issue you are thinking of. In contrast, if you could
>> guarantee that the memory access was promoted to 32 bit (since addition of
>> multiprecision operations with different word sizes used yields the same
>> result), one would no longer have that issue. On the other hand, I do not think
>> I can assume this will be done and it is irrelevant to the discussion since we
>> are discussing PromotingIntegerResults = p.
>>
>> Given that we are talking about promoting integer results, perhaps the proper
>> way to do this is to just zero extend ResNo == 0 (i.e. not the operands, just
>> the result). What do you think Duncan?
>>
>> Michael
>>
>> P.S. I realized the reason why I returned SDValue was b/c I thought the integer
>> promotion was like DAGCombine where one returns SDValue if one is not doing
>> anything vs asserting in PromoteIntegerResult where it looks like you assert. = (.
>
> In type legalization, first the target gets to have a go at legalizing the
> operation. It can return SDValue() if it doesn't know how or doesn't want to.
> Then the generic logic (eg your ADDC promotion method) kicks in. If it can't
> legalize the value *then there is no fall-back*: no-one knows how to legalize
> the value, type legalization has failed. So someone has to assert somewhere!
> As returning SDValue() is used to mean "I legalized the type by completely
> zapping the node" there is no convenient way to signal to the machinery that
> you failed (i.e. having it assert for you), so you have to just assert yourself.
>
> Ciao, Duncan.
>
>>
>> On May 21, 2013, at 12:12 PM, Duncan Sands <duncan.sands at gmail.com
>> <mailto:duncan.sands at gmail.com>> wrote:
>>
>>> Hi Michael,
>>>
>>> On 17/05/13 23:33, Michael Gottesman wrote:
>>>> Hows this look Duncan/Sean?
>>>
>>>
>>> I didn't think about ADDE but at a glance it has the same flaw.
>>>
>>> Ciao, Duncan.
>>>
>>>>
>>>>
>>>>
>>>>
>>>> Michael
>>>>
>>>> On May 16, 2013, at 10:41 AM, Michael Gottesman <mgottesman at apple.com
>>>> <mailto:mgottesman at apple.com>
>>>> <mailto:mgottesman at apple.com>> wrote:
>>>>
>>>>> I am trying to be incremental and did not have a need for such lowering. On
>>>>> the other hand if you want I can implement that in this patch. Just state your
>>>>> preference = ).
>>>>>
>>>>> Michael
>>>>>
>>>>> On May 16, 2013, at 1:34 AM, Duncan Sands <duncan.sands at gmail.com
>>>>> <mailto:duncan.sands at gmail.com>
>>>>> <mailto:duncan.sands at gmail.com>> wrote:
>>>>>
>>>>>> Hi Michael,
>>>>>>
>>>>>> On 15/05/13 19:37, Michael Gottesman wrote:
>>>>>>> The attached patch teaches selection DAG how to promote the integer carry
>>>>>>> result of an {add,sub}{c,e} node to a larger integer type. It is a part of a
>>>>>>> larger set of patches that I will be committing over the next little bit to
>>>>>>> enable the optimization of uaddo/usubo chains.
>>>>>>>
>>>>>>> Please Review,
>>>>>>> Michael
>>>>>>
>>>>>> +SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBC(SDNode *N, unsigned ResNo) {
>>>>>> + if (ResNo == 1)
>>>>>> + return PromoteIntRes_Overflow(N);
>>>>>> + return SDValue();
>>>>>> +}
>>>>>> +
>>>>>> +SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBE(SDNode *N, unsigned ResNo) {
>>>>>> + if (ResNo == 1)
>>>>>> + return PromoteIntRes_Overflow(N);
>>>>>> + return SDValue();
>>>>>> +}
>>>>>>
>>>>>> why do you return SDValue() if ResNo != 1 ? Shouldn't you either assert or
>>>>>> promote the value?
>>>>>>
>>>>>> Ciao, Duncan.
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> <mailto:llvm-commits at cs.uiuc.edu><mailto:llvm-commits at cs.uiuc.edu>
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> <mailto:llvm-commits at cs.uiuc.edu><mailto:llvm-commits at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130522/234aa3ff/attachment.html>
More information about the llvm-commits
mailing list