[PATCH] D11438: Fix x86_64 fp128 calling convention

Chih-Hung Hsieh via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 24 17:22:44 PST 2015


chh added a comment.

David,

I will upload a new patch with some of your older suggested changes first.
Please see details in the new code diff and upload comment.

I met many difficulties implementing your latest inline comments
in LegalizeFloatTypes.cpp, so there is no change with that idea yet.
Please see my reply to the inline comment.

For separating this large patch, I have not tried yet,
but it might be okay to split at least into two parts:
(2) The new unit test files *.ll and X86CallingConv.td, X86ISelLowering.cpp, X86InstrInfo.td, X86InstrSSE.td, X86RegisterInfo.td.
(1) All the other files.

We should be able to check in (1) and expect no change to all targets.
Patch (2) will fix Android x86_64 problem, but still no change to other targets.
I am assuming that we want some time to test (1) before submitting (2).
Further split of (1) might be possible, but that would take more review
and test cycles if there is little regression. When there is any regression,
it should be easy to revert all changes in (1) and then debug-and-retry, right?
I don't see a way to split (2), as we don't want to have the Android target
in a new incomplete state different from the current one.


================
Comment at: lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp:158
@@ +157,3 @@
+    // When LegalInHWReg is true, we take a short-cut in SoftenFloatOperand
+    // to save repeated computations and make the code simpler there.
+    // So here we need to call ReplaceValueWith.
----------------
davidxl wrote:
> I don't quite like the short cut here either.  The original architected flow is reasonable to me: basically 
> 1) After softenfloat result, the old SDValue is mapped to the new result
> 2) the worklist based loop follow DU and use instrctions of the old result get processed -- operands are scanned.
> 3) during scanning, the use of the old value will be replaced with the new value.
> 4) if scan operands result in a new node, replaceValueWith then needs to be eagerly called here on demand.
> 
> I prefer you do not change the flow in SoftenFloatOperands and make a patch here to workaround it. The resulting change may look bigger (as more opcodes need to be handled), but it is what it is.
> 
> I also suggest you extract the following patches out seperately
> 1) softenFloatResult and softenFloatOperands
> 2) f128 constantFP
> 3) register class small refactoring
> 4) the rest (may also be breakable )
The difficulties I found in the current architecture are
(1) Avoid duplicating a lot of code from SoftenFloatRes_* to SoftenFloatOp_*.
(2) Make sure the changes in multiple SoftenFloatRes_* and SoftenFloatOp_* will be fixed correctly at the same time in the future.

In SoftenFloatOperand, we can see that it calls ReplaceValueWith
at the end for all kinds of opcode. The number of calls to ReplaceValueWith
is not as many as we expected. Hence, it seems to me the norm is to share
code at higher level than pushing down into SoftenFloatOp_* and SoftenFloatRes_*.
It seems reasonable to call ReplaceValueWith at the end of
SoftenFloatOperand and SoftenFloatResult.

I tried to remove this part, but the result was either
(a) calling ReplaceValuewith inside SoftenFloatRes_*, or
(b) copying many SoftenFloatRes_* code into SoftenFloatOp_* and then call ReplaceValueWith in SoftenFloatOp_*.  The copied code is not all simple and small.  For example, SoftenFloatRes_FCOPYSIGN is complicated, and I don't see how simpler it is to duplicate it into SoftenFloatOp_FCOPYSIGN, or abstract it out into a shared function.

What you mentioned in another email earlier might be better to
move some SoftenFloatOp_* responsibility into SoftenFloatRes_*.
Then, we have less code duplication, and we need to make sure
that ReplaceValueWith is called whenever something is changed
by SoftenFloatRes_*. I could try that when I have more time.

My concern (2) is more scaring. During my test of those code movement,
I made a few mistakes and the compiler could run and generate inefficient
code because some ReplaceValueWith call was missed or some fp128 was
not kept in register due to a short-cut was missed. So I would prefer
a way easier and smaller change to verify no no-impact to existing targets,
by conditioned on isLegalInHWReg, even if some can be applied to other targets.




http://reviews.llvm.org/D11438





More information about the llvm-commits mailing list