[PATCH] D11438: Fix x86_64 fp128 calling convention

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 1 16:09:45 PST 2015


ok -- sounds good. Please submit patch-1 and we will do a final look.

David


On Tue, Dec 1, 2015 at 4:06 PM, Chih-hung Hsieh <chh at google.com> wrote:

> David,
>
> I tried again to do without ReplaceValueWith in SoftenFloatResult for
> fp128,
> and with cloning of nodes in SoftenFloatOperand. I found more problems than
> my previous FCOPYSIGN example.
>
> For the CopyToReg opcode, I found multiple formats and
> there could be multiple "result values". The simple call to
>    ReplaceValueWith(SDValue(N, 0), Res);
> at the end of current SoftenFloatOperand won't work.
> I will need to call
>    ReplaceValueWith(SDValue(N, i), SDValue(NewNode, i));
> for every result i.
> So that added not only more code into SoftenFloatOp_*, but also
> more complicated SoftenFloatOperand.
>
> Another more difficult pattern is
>     Soften float result 0: t4: f128 = fmul t2, ConstantFP:f128
> followed by
>     Expand integer result: t5: i128 = bitcast t4
>
> If SoftenFloatResult does not call ReplaceValueWith for t4,
> ExpandIntegerResult will need to "replace" its operand with softened t4
> or clone the whole BITCASE node. That will require more complicated
> changes in ExpandRes_BITCAST(N, Lo, Hi) and ExpandIntegerResult.
> Right now, in my patch I have only a simple condition inside
> ExpandRes_BITCAST
> to do nothing for fp128, because we can keep f128 and i128 in the same
> register.
>
> So, without ReplaceValueWith inside SoftenFloatResult, we not only need to
> clone nodes in SoftenFloatOperands, but also in other kinds of "legalize"
> functions like ExpandIntegerResult.
>
> Comparing the two approaches:
> (1) calling ReplaceValueWith of a node to save cloning of all user nodes
> like I have now, or
> (2) not calling ReplaceValueWith of a node but clone all user nodes when
> that node is scanned as an operand.
> I saw both patterns in llvm, but (1) seemed to be more efficient.
> I think the <valid>Op_* functions exist because some operand nodes are not
> transformed
> to other "legal" type. Right now most "illegal" type node are transformed
> to legal type,
> but that won't be true for new x86_64 f128 type. If we use approach (2),
> we will need to add many new <valid>Op_* functions and even change the
> ExpandIntegerResult function.
> To me, it is lower risk to depend on fewer <valid>Op_* functions to handle
> f128/i128 values in register, than to depend on more <valid>Op_* and
> <valid>Res_* functions to clone the new soften-yet-still-f128-type operands.
>
> Beside this issue, do you see other area to change?
>
> I have tried the part 1 subset mentioned in my previous comment.
> That passed all current llvm unit tests and include the following files.
>     include/llvm/Target/TargetLowering.h
>     include/llvm/Target/TargetRegisterInfo.h
>     lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>     lib/CodeGen/SelectionDAG/InstrEmitter.cpp
>     lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
>     lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
>     lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
>     lib/CodeGen/SelectionDAG/LegalizeTypes.h
>     lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
>     lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>     lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
>     lib/CodeGen/SelectionDAG/TargetLowering.cpp
>     lib/CodeGen/TargetLoweringBase.cpp
>     lib/CodeGen/TargetRegisterInfo.cpp
>     lib/Target/X86/X86MCInstLower.cpp
>
> If there is no other change, I will upload that part 1 change in a new
> patch.
> Thanks.
>
>
> On Wed, Nov 25, 2015 at 4:25 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Tue, Nov 24, 2015 at 5:22 PM, Chih-Hung Hsieh <chh at google.com> wrote:
>>
>>> chh added a comment.
>>>
>>> David,
>>>
>>> I will upload a new patch with some of your older suggested changes
>>> first.
>>> Please see details in the new code diff and upload comment.
>>>
>>> I met many difficulties implementing your latest inline comments
>>> in LegalizeFloatTypes.cpp, so there is no change with that idea yet.
>>> Please see my reply to the inline comment.
>>>
>>> For separating this large patch, I have not tried yet,
>>> but it might be okay to split at least into two parts:
>>> (2) The new unit test files *.ll and X86CallingConv.td,
>>> X86ISelLowering.cpp, X86InstrInfo.td, X86InstrSSE.td, X86RegisterInfo.td.
>>> (1) All the other files.
>>>
>>> We should be able to check in (1) and expect no change to all targets.
>>> Patch (2) will fix Android x86_64 problem, but still no change to other
>>> targets.
>>> I am assuming that we want some time to test (1) before submitting (2).
>>> Further split of (1) might be possible, but that would take more review
>>> and test cycles if there is little regression. When there is any
>>> regression,
>>> it should be easy to revert all changes in (1) and then debug-and-retry,
>>> right?
>>> I don't see a way to split (2), as we don't want to have the Android
>>> target
>>> in a new incomplete state different from the current one.
>>>
>>>
>>> ================
>>> Comment at: lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp:158
>>> @@ +157,3 @@
>>> +    // When LegalInHWReg is true, we take a short-cut in
>>> SoftenFloatOperand
>>> +    // to save repeated computations and make the code simpler there.
>>> +    // So here we need to call ReplaceValueWith.
>>> ----------------
>>> davidxl wrote:
>>> > I don't quite like the short cut here either.  The original
>>> architected flow is reasonable to me: basically
>>> > 1) After softenfloat result, the old SDValue is mapped to the new
>>> result
>>> > 2) the worklist based loop follow DU and use instrctions of the old
>>> result get processed -- operands are scanned.
>>> > 3) during scanning, the use of the old value will be replaced with the
>>> new value.
>>> > 4) if scan operands result in a new node, replaceValueWith then needs
>>> to be eagerly called here on demand.
>>> >
>>> > I prefer you do not change the flow in SoftenFloatOperands and make a
>>> patch here to workaround it. The resulting change may look bigger (as more
>>> opcodes need to be handled), but it is what it is.
>>> >
>>> > I also suggest you extract the following patches out seperately
>>> > 1) softenFloatResult and softenFloatOperands
>>> > 2) f128 constantFP
>>> > 3) register class small refactoring
>>> > 4) the rest (may also be breakable )
>>> The difficulties I found in the current architecture are
>>> (1) Avoid duplicating a lot of code from SoftenFloatRes_* to
>>> SoftenFloatOp_*.
>>> (2) Make sure the changes in multiple SoftenFloatRes_* and
>>> SoftenFloatOp_* will be fixed correctly at the same time in the future.
>>>
>>> In SoftenFloatOperand, we can see that it calls ReplaceValueWith
>>> at the end for all kinds of opcode. The number of calls to
>>> ReplaceValueWith
>>> is not as many as we expected. Hence, it seems to me the norm is to share
>>> code at higher level than pushing down into SoftenFloatOp_* and
>>> SoftenFloatRes_*.
>>> It seems reasonable to call ReplaceValueWith at the end of
>>> SoftenFloatOperand and SoftenFloatResult.
>>>
>>
>>
>> You can't compare SoftenFloatOperand with SoftenFloatResult like this.
>> SoftenFloatResult is for the value producer (defs) -- there is no need
>> eagerly call ReplaceValueWith because their consumer nodes will naturally
>> follow and be processed (either as value producer or consumer with operand
>> scanned).
>>
>> SoftenFloatOperands on the other hand, by definition, does not enable new
>> consumer nodes to be ready in the worklist so it needs to  call
>> ReplaceValueWith when the node changes.
>>
>> Anyway, I think we are getting close to a closure. Can you first carve
>> out part-1 (as I proposed above) first (I assume it won't affect others).
>>
>> David
>>
>>
>>>
>>> I tried to remove this part, but the result was either
>>> (a) calling ReplaceValuewith inside SoftenFloatRes_*, or
>>> (b) copying many SoftenFloatRes_* code into SoftenFloatOp_* and then
>>> call ReplaceValueWith in SoftenFloatOp_*.  The copied code is not all
>>> simple and small.  For example, SoftenFloatRes_FCOPYSIGN is complicated,
>>> and I don't see how simpler it is to duplicate it into
>>> SoftenFloatOp_FCOPYSIGN, or abstract it out into a shared function.
>>>
>>> What you mentioned in another email earlier might be better to
>>> move some SoftenFloatOp_* responsibility into SoftenFloatRes_*.
>>> Then, we have less code duplication, and we need to make sure
>>> that ReplaceValueWith is called whenever something is changed
>>> by SoftenFloatRes_*. I could try that when I have more time.
>>>
>>> My concern (2) is more scaring. During my test of those code movement,
>>> I made a few mistakes and the compiler could run and generate inefficient
>>> code because some ReplaceValueWith call was missed or some fp128 was
>>> not kept in register due to a short-cut was missed. So I would prefer
>>> a way easier and smaller change to verify no no-impact to existing
>>> targets,
>>> by conditioned on isLegalInHWReg, even if some can be applied to other
>>> targets.
>>>
>>>
>>>
>>>
>>> http://reviews.llvm.org/D11438
>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151201/8c869abd/attachment.html>


More information about the llvm-commits mailing list