[PATCH] D11438: Fix x86_64 fp128 calling convention

Chih-hung Hsieh via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 1 16:06:47 PST 2015


David,

I tried again to do without ReplaceValueWith in SoftenFloatResult for fp128,
and with cloning of nodes in SoftenFloatOperand. I found more problems than
my previous FCOPYSIGN example.

For the CopyToReg opcode, I found multiple formats and
there could be multiple "result values". The simple call to
   ReplaceValueWith(SDValue(N, 0), Res);
at the end of current SoftenFloatOperand won't work.
I will need to call
   ReplaceValueWith(SDValue(N, i), SDValue(NewNode, i));
for every result i.
So that added not only more code into SoftenFloatOp_*, but also
more complicated SoftenFloatOperand.

Another more difficult pattern is
    Soften float result 0: t4: f128 = fmul t2, ConstantFP:f128
followed by
    Expand integer result: t5: i128 = bitcast t4

If SoftenFloatResult does not call ReplaceValueWith for t4,
ExpandIntegerResult will need to "replace" its operand with softened t4
or clone the whole BITCASE node. That will require more complicated changes
in ExpandRes_BITCAST(N, Lo, Hi) and ExpandIntegerResult.
Right now, in my patch I have only a simple condition inside
ExpandRes_BITCAST
to do nothing for fp128, because we can keep f128 and i128 in the same
register.

So, without ReplaceValueWith inside SoftenFloatResult, we not only need to
clone nodes in SoftenFloatOperands, but also in other kinds of "legalize"
functions like ExpandIntegerResult.

Comparing the two approaches:
(1) calling ReplaceValueWith of a node to save cloning of all user nodes
like I have now, or
(2) not calling ReplaceValueWith of a node but clone all user nodes when
that node is scanned as an operand.
I saw both patterns in llvm, but (1) seemed to be more efficient.
I think the <valid>Op_* functions exist because some operand nodes are not
transformed
to other "legal" type. Right now most "illegal" type node are transformed
to legal type,
but that won't be true for new x86_64 f128 type. If we use approach (2), we
will need to add many new <valid>Op_* functions and even change the
ExpandIntegerResult function.
To me, it is lower risk to depend on fewer <valid>Op_* functions to handle
f128/i128 values in register, than to depend on more <valid>Op_* and
<valid>Res_* functions to clone the new soften-yet-still-f128-type operands.

Beside this issue, do you see other area to change?

I have tried the part 1 subset mentioned in my previous comment.
That passed all current llvm unit tests and include the following files.
    include/llvm/Target/TargetLowering.h
    include/llvm/Target/TargetRegisterInfo.h
    lib/CodeGen/SelectionDAG/DAGCombiner.cpp
    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
    lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
    lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
    lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
    lib/CodeGen/SelectionDAG/LegalizeTypes.h
    lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
    lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
    lib/CodeGen/SelectionDAG/TargetLowering.cpp
    lib/CodeGen/TargetLoweringBase.cpp
    lib/CodeGen/TargetRegisterInfo.cpp
    lib/Target/X86/X86MCInstLower.cpp

If there is no other change, I will upload that part 1 change in a new
patch.
Thanks.


On Wed, Nov 25, 2015 at 4:25 PM, Xinliang David Li <davidxl at google.com>
wrote:

>
>
> On Tue, Nov 24, 2015 at 5:22 PM, Chih-Hung Hsieh <chh at google.com> wrote:
>
>> chh added a comment.
>>
>> David,
>>
>> I will upload a new patch with some of your older suggested changes first.
>> Please see details in the new code diff and upload comment.
>>
>> I met many difficulties implementing your latest inline comments
>> in LegalizeFloatTypes.cpp, so there is no change with that idea yet.
>> Please see my reply to the inline comment.
>>
>> For separating this large patch, I have not tried yet,
>> but it might be okay to split at least into two parts:
>> (2) The new unit test files *.ll and X86CallingConv.td,
>> X86ISelLowering.cpp, X86InstrInfo.td, X86InstrSSE.td, X86RegisterInfo.td.
>> (1) All the other files.
>>
>> We should be able to check in (1) and expect no change to all targets.
>> Patch (2) will fix Android x86_64 problem, but still no change to other
>> targets.
>> I am assuming that we want some time to test (1) before submitting (2).
>> Further split of (1) might be possible, but that would take more review
>> and test cycles if there is little regression. When there is any
>> regression,
>> it should be easy to revert all changes in (1) and then debug-and-retry,
>> right?
>> I don't see a way to split (2), as we don't want to have the Android
>> target
>> in a new incomplete state different from the current one.
>>
>>
>> ================
>> Comment at: lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp:158
>> @@ +157,3 @@
>> +    // When LegalInHWReg is true, we take a short-cut in
>> SoftenFloatOperand
>> +    // to save repeated computations and make the code simpler there.
>> +    // So here we need to call ReplaceValueWith.
>> ----------------
>> davidxl wrote:
>> > I don't quite like the short cut here either.  The original architected
>> flow is reasonable to me: basically
>> > 1) After softenfloat result, the old SDValue is mapped to the new result
>> > 2) the worklist based loop follow DU and use instrctions of the old
>> result get processed -- operands are scanned.
>> > 3) during scanning, the use of the old value will be replaced with the
>> new value.
>> > 4) if scan operands result in a new node, replaceValueWith then needs
>> to be eagerly called here on demand.
>> >
>> > I prefer you do not change the flow in SoftenFloatOperands and make a
>> patch here to workaround it. The resulting change may look bigger (as more
>> opcodes need to be handled), but it is what it is.
>> >
>> > I also suggest you extract the following patches out seperately
>> > 1) softenFloatResult and softenFloatOperands
>> > 2) f128 constantFP
>> > 3) register class small refactoring
>> > 4) the rest (may also be breakable )
>> The difficulties I found in the current architecture are
>> (1) Avoid duplicating a lot of code from SoftenFloatRes_* to
>> SoftenFloatOp_*.
>> (2) Make sure the changes in multiple SoftenFloatRes_* and
>> SoftenFloatOp_* will be fixed correctly at the same time in the future.
>>
>> In SoftenFloatOperand, we can see that it calls ReplaceValueWith
>> at the end for all kinds of opcode. The number of calls to
>> ReplaceValueWith
>> is not as many as we expected. Hence, it seems to me the norm is to share
>> code at higher level than pushing down into SoftenFloatOp_* and
>> SoftenFloatRes_*.
>> It seems reasonable to call ReplaceValueWith at the end of
>> SoftenFloatOperand and SoftenFloatResult.
>>
>
>
> You can't compare SoftenFloatOperand with SoftenFloatResult like this.
> SoftenFloatResult is for the value producer (defs) -- there is no need
> eagerly call ReplaceValueWith because their consumer nodes will naturally
> follow and be processed (either as value producer or consumer with operand
> scanned).
>
> SoftenFloatOperands on the other hand, by definition, does not enable new
> consumer nodes to be ready in the worklist so it needs to  call
> ReplaceValueWith when the node changes.
>
> Anyway, I think we are getting close to a closure. Can you first carve out
> part-1 (as I proposed above) first (I assume it won't affect others).
>
> David
>
>
>>
>> I tried to remove this part, but the result was either
>> (a) calling ReplaceValuewith inside SoftenFloatRes_*, or
>> (b) copying many SoftenFloatRes_* code into SoftenFloatOp_* and then call
>> ReplaceValueWith in SoftenFloatOp_*.  The copied code is not all simple and
>> small.  For example, SoftenFloatRes_FCOPYSIGN is complicated, and I don't
>> see how simpler it is to duplicate it into SoftenFloatOp_FCOPYSIGN, or
>> abstract it out into a shared function.
>>
>> What you mentioned in another email earlier might be better to
>> move some SoftenFloatOp_* responsibility into SoftenFloatRes_*.
>> Then, we have less code duplication, and we need to make sure
>> that ReplaceValueWith is called whenever something is changed
>> by SoftenFloatRes_*. I could try that when I have more time.
>>
>> My concern (2) is more scaring. During my test of those code movement,
>> I made a few mistakes and the compiler could run and generate inefficient
>> code because some ReplaceValueWith call was missed or some fp128 was
>> not kept in register due to a short-cut was missed. So I would prefer
>> a way easier and smaller change to verify no no-impact to existing
>> targets,
>> by conditioned on isLegalInHWReg, even if some can be applied to other
>> targets.
>>
>>
>>
>>
>> http://reviews.llvm.org/D11438
>>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151201/0e4cbdd5/attachment.html>


More information about the llvm-commits mailing list