[PATCH] D11438: Fix x86_64 fp128 calling convention

Mon Oct 12 17:27:26 PDT 2015

chh added a comment.

This patch should change output of fp128 for Android x86_64 only,
to fix mainly the problem in https://llvm.org/bugs/show_bug.cgi?id=23897.

No regression test failure was found for other targets, but since I am not sure if all other targets have enough regression tests on fp128, I tried to make changes only depend on the new configuration or work for all configurations.

It's possible to break this patch into several ones, but then only the last one should turn on the configuration change for Android x86_64. Then, all patches before the last one will be incomplete and not really tested. It's not right to change the Android x86_64 configuration before the last patch, because right now, fp128 for Android x86_64 works by itself, just not compatible with gcc. A partial change of the fp128 code will fail to compile libm. For example, the AsmPrinter change was actually discovered when libm was compiled with -g. The current code cannot handle new fp128 constant value.

The change was larger than typical bug patch because I think this is the first configuration to keep fp128 value in SSE registers. Before this, most x86 configurations used fp80 on floating point stack, and fp128 on x86 was split into two 64-bit registers. Hence, fp128 was passed in two 64-bit registers or on stack, but we need to pass them in SSE to be compatible with gcc and the ABI. Current nice legalization abstraction for floating point values worked mostly for most target, except a few cases that we have seen hacked for ppc. For fp128 in SSE registers, we need more enhancements to this legalization abstraction layer, or I would have to write up another new kind of action maybe 90% similar to soften floating point, but do something differently.

Some changes to ad-hoc optimizations related to fp128 were made only now because the existing code have not handled fp128 before. The fp128 values were split into two registers and/or converted to library function calls and do not hit the optimization code. Even the ppc f128 type is in two f64 registers. These were mostly found in the compilation of libm, which heavily cast fp128 to/from i128 to use faster bit-wise operations. I have added test cases for those problems I found. So hopefully if any of such changes were reverted, some regression test will fail.


================
Comment at: include/llvm/Target/TargetLowering.h:1907
@@ -1906,2 +1906,3 @@
+protected:
   ValueTypeActionImpl ValueTypeActions;
 
----------------
qcolombet wrote:
> That change makes me shiver.
> We probably do something against the design if we need to do that.
I think an alternative is to keep ValueTypeActions private here, but to hack into computeRegisterProperties of lib/CodeGen/TargetLoweringBase.cpp to define different actions for Android x86_64 fp128. That function now has one hack for ppcf128, but that is depending on a special MVT::ppcf128. If we add target dependent code into computeRegisterProperties, for only Android x86_64, not all f128, then that might look more like a hack than making ValueTypeActions accessible to child classes.


================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:8592
@@ +8591,3 @@
+  // instruction selection cannot handle FCOPYSIGN on
+  // SSE registers yet.
+  EVT N1VT = N1->getValueType(0);
----------------
qcolombet wrote:
> Based on the comment, this seems like a hack that impact all targets to fix a x86_64 backend limitation.
> The bottom line is that I am guessing we should push that change.
Yes, i think this impact all x86_64 f128 values in one register.
We didn't hit this problem because no target keeps f128 in one register.
What is the alternative change?


================
Comment at: lib/CodeGen/SelectionDAG/InstrEmitter.cpp:147
@@ -143,1 +146,3 @@
+              // Until that behavior of FR128 is changed,
+              // we need to handle such case later before using UseRC.
               // If multiple uses expect disjoint register classes, we emit
----------------
qcolombet wrote:
> By definition the common sub class of A and B must be able to hold both the types of A and B, since the related registers are both in A and in B (i.e., could be access by A or B register names).
> The bottom line is that comment must be always wrong and the target must do the right thing to ensure this is true, IMO.
I can think about this more. I was putting off that kind of change to getCommonSubClass as a TODO in the future, but that was more than a month ago.


================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1205
@@ -1190,1 +1204,3 @@
+              TargetLowering::TypeLegal ||
+            TLI.isTypeLegal(Node->getValueType(i))) &&
            "Unexpected illegal type!");
----------------
qcolombet wrote:
> How could that be different?!
> That looks wrong to have to check for both conditions.
This is needed only for the new special configuration of Android x86_64 fp128, which is TypeLegal to be in a register, but with a type action to soften the operations to library calls. It's TypeAction is not TypeLegal.



http://reviews.llvm.org/D11438