[PATCH] [Multiprecision Arithmetic Optimization] Teach SelectionDAG how to promote the integer carry result of {add, sub}{c, e} to a larger integer type.

Wed May 22 00:33:17 PDT 2013

Hi Michael,

On 21/05/13 23:53, Michael Gottesman wrote:
> Hey Duncan. I am going to respond in a backwards order since I think it will
> yield additional clarity [if it does not, please tell me and I will elaborate = )]:
>
>> Actually, isn't ADDC really exactly the same as UADDO?  If so, how about
>> eliminating the ADDC in favour of UADDO?  In any case, there's already
>> promotion logic for UADDO, DAGTypeLegalizer::PromoteIntRes_UADDSUBO, so
>> maybe you could take a look.

actually I should probably have said: eliminate UADDO, replacing it with ADDC
everywhere.

>
> While ADDC/UADDO seem similar they are meant to model two different things.
>
> UADDO is meant to model the situation where one performs an addition and wishes
> to detect overflow as a boolean value and perform some non-arithmetic operation,

How do you know it will be used for non-arithmetic operations?

> not perform additional arithmetic with said overflow value. Thus it makes sense
> when we are promoting UADDO into an add + setne operation since UADDO is lowered
> into similar code anyways (see LowerXALUO X86ISelLowering.cpp) meaning no
> performance hit.
>
> On the other hand ADDC is meant to model the situation where one wishes to
> perform additional arithmetic with the overflow value (specifically in a
> multiprecision arithmetic context), i.e. on intel:
>
> addq (%rbx), %rax
> adcq 8(%rbx), %rcx
> adcq 16(%rbx), %ddx
> ...
>
> If we were to lower ADDC like UADDO, we would instead have something along the
> lines of the following non-optimal codegen:
>
> xorq %rax, %rax
> addq (%rbx), %rcx
> setne %al
> movq %rax, %r8
> addq 8(%rbx), %r9
> getne %al
> addq %r9, %r8
> ...
>
> Thus the necessity of this distinction. X86 definitely follows said distinction
> if you look at the lowering.

I disagree.  The ADDC and UADDO SDAG nodes have exactly the same definition.
IMO, the fact that UADDO codegen is poor is a codegen weakness that should be
improved, not a reason to keep two nodes, unless there really is no other way.

>
> Which leads me to your first paragraph.
>
>> thanks for adding the ResNo == 0 case, but I don't think it is correct.
>> That's because you continue to return an ADDC code, yet because you zero
>> extended the add operands there is no reason why the extended operands
>> should overflow (set the carry flag) exactly when the original operation
>> would.
>
> This is not completely true. It depends on how you do it. Specifically it
> depends on how one legalizes the source operations of the addc.

No it doesn't, this is not how type legalization works.  Upper bits of promoted
integers contain *undefined bits*.  If they happen to be zext'd then you just
got lucky, you can't rely on it.  That's why if you *require* the extra bits to
be zero-extended you need to use ZExtPromotedInteger, which explicitly zeros out
the extra bits (the optimizers are often able to get rid of the explicit zeroing
out if instead they can use an incoming node for which they can prove that the
bits are zero already).

Anyway, this is something of a digression, since promotion *doesn't change the
meaning of what you are calculating*.  Consider the following operation:

   (i8 x, i1 flag) = addc i8 255, 2

Then you get:

    x = i8 1
    flag = i1 1

On a platform where i8 and i1 have to be promoted to 32 bits, this *must* get
legalized to:

   x = a 32 bit integer, where the 8 lower bits are 00000001 (aka i8 1), and the
rest can be anything (it doesn't matter what they are, you can choose something
convenient)

   flag = a 32 bit integer, where the lowest bit is 1, and the rest can be
anything.  (If promoting according to the so-called "boolean contents" then
the extra bits may have additional requirements, such as being zero- or sign-
extended from the lowest bit).

Now consider what your logic does, supposing first you have to promote ResNo 0.
Your logic forms

   (i32 x', i1 flag') = addc i32 255, 2

So you get:

   x' = i32 257
   flag' = i1 0  (as this i32 operation does not set the carry flag)

It is OK to use x' for the promoted value of x, since the 8 lowest bits are
00000001.  But it is not OK to use flag' for flag, since the lowest bit is
zero, not 1.

In short, the problem is correctly calculating the flag (you have to check if
bit 8 of x' was set), not calculating the sum.  See the UADDO promotion logic
for one way that calculating the flag can be done.

  For instance if
> one is loading from an 8 bit array, zext to a 32 bit operation and then perform
> the addc you would have the issue you are thinking of. In contrast, if you could
> guarantee that the memory access was promoted to 32 bit (since addition of
> multiprecision operations with different word sizes used yields the same
> result), one would no longer have that issue. On the other hand, I do not think
> I can assume this will be done and it is irrelevant to the discussion since we
> are discussing PromotingIntegerResults = p.
>
> Given that we are talking about promoting integer results, perhaps the proper
> way to do this is to just zero extend ResNo == 0 (i.e. not the operands, just
> the result). What do you think Duncan?
>
> Michael
>
> P.S. I realized the reason why I returned SDValue was b/c I thought the integer
> promotion was like DAGCombine where one returns SDValue if one is not doing
> anything vs asserting in PromoteIntegerResult where it looks like you assert. = (.

In type legalization, first the target gets to have a go at legalizing the
operation.  It can return SDValue() if it doesn't know how or doesn't want to.
Then the generic logic (eg your ADDC promotion method) kicks in.  If it can't
legalize the value *then there is no fall-back*: no-one knows how to legalize
the value, type legalization has failed.  So someone has to assert somewhere!
As returning SDValue() is used to mean "I legalized the type by completely
zapping the node" there is no convenient way to signal to the machinery that
you failed (i.e. having it assert for you), so you have to just assert yourself.

Ciao, Duncan.

>
> On May 21, 2013, at 12:12 PM, Duncan Sands <duncan.sands at gmail.com
> <mailto:duncan.sands at gmail.com>> wrote:
>
>> Hi Michael,
>>
>> On 17/05/13 23:33, Michael Gottesman wrote:
>>> Hows this look Duncan/Sean?
>>
>>
>> I didn't think about ADDE but at a glance it has the same flaw.
>>
>> Ciao, Duncan.
>>
>>>
>>>
>>>
>>>
>>> Michael
>>>
>>> On May 16, 2013, at 10:41 AM, Michael Gottesman <mgottesman at apple.com
>>> <mailto:mgottesman at apple.com>
>>> <mailto:mgottesman at apple.com>> wrote:
>>>
>>>> I am trying to be incremental and did not have a need for such lowering. On
>>>> the other hand if you want I can implement that in this patch. Just state your
>>>> preference = ).
>>>>
>>>> Michael
>>>>
>>>> On May 16, 2013, at 1:34 AM, Duncan Sands <duncan.sands at gmail.com
>>>> <mailto:duncan.sands at gmail.com>
>>>> <mailto:duncan.sands at gmail.com>> wrote:
>>>>
>>>>> Hi Michael,
>>>>>
>>>>> On 15/05/13 19:37, Michael Gottesman wrote:
>>>>>> The attached patch teaches selection DAG how to promote the integer carry
>>>>>> result of an {add,sub}{c,e} node to a larger integer type. It is a part of a
>>>>>> larger set of patches that I will be committing over the next little bit to
>>>>>> enable the optimization of uaddo/usubo chains.
>>>>>>
>>>>>> Please Review,
>>>>>> Michael
>>>>>
>>>>> +SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBC(SDNode *N, unsigned ResNo) {
>>>>> +  if (ResNo == 1)
>>>>> +    return PromoteIntRes_Overflow(N);
>>>>> +  return SDValue();
>>>>> +}
>>>>> +
>>>>> +SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBE(SDNode *N, unsigned ResNo) {
>>>>> +  if (ResNo == 1)
>>>>> +    return PromoteIntRes_Overflow(N);
>>>>> +  return SDValue();
>>>>> +}
>>>>>
>>>>> why do you return SDValue() if ResNo != 1 ?  Shouldn't you either assert or
>>>>> promote the value?
>>>>>
>>>>> Ciao, Duncan.
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> <mailto:llvm-commits at cs.uiuc.edu><mailto:llvm-commits at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> <mailto:llvm-commits at cs.uiuc.edu><mailto:llvm-commits at cs.uiuc.edu>
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>