[llvm] r328313 - [ARM] Support float literals under XO

Fri Mar 23 12:49:46 PDT 2018

On 3/23/2018 6:02 AM, Christof Douma via llvm-commits wrote:
> Author: christof
> Date: Fri Mar 23 06:02:03 2018
> New Revision: 328313
>
> URL: http://llvm.org/viewvc/llvm-project?rev=328313&view=rev
> Log:
> [ARM] Support float literals under XO
>
> When targeting execute-only and fp-armv8, float constants in a compare
> resulted in instruction selection failures. This is now fixed by using
> vmov.f32 where possible, otherwise the floating point constant is
> lowered into a integer constant that is moved into a floating point
> register.
>
> This patch also restores using fpcmp with immediate 0 under fp-armv8.

I can't find any previous discussion of this patch; where was it reviewed?

>
> Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
>
> Added:
>      llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll
> Modified:
>      llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
>      llvm/trunk/lib/Target/ARM/ARMISelLowering.h
>      llvm/trunk/lib/Target/ARM/ARMInstrVFP.td
>
> Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=328313&r1=328312&r2=328313&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Fri Mar 23 06:02:03 2018
> @@ -1283,6 +1283,7 @@ const char *ARMTargetLowering::getTarget
>     case ARMISD::VMOVDRR:       return "ARMISD::VMOVDRR";
>     case ARMISD::VMOVhr:        return "ARMISD::VMOVhr";
>     case ARMISD::VMOVrh:        return "ARMISD::VMOVrh";
> +  case ARMISD::VMOVSR:        return "ARMISD::VMOVSR";
>   
>     case ARMISD::EH_SJLJ_SETJMP: return "ARMISD::EH_SJLJ_SETJMP";
>     case ARMISD::EH_SJLJ_LONGJMP: return "ARMISD::EH_SJLJ_LONGJMP";
> @@ -4518,9 +4519,10 @@ SDValue ARMTargetLowering::LowerSELECT_C
>     bool InvalidOnQNaN;
>     FPCCToARMCC(CC, CondCode, CondCode2, InvalidOnQNaN);
>   
> -  // Try to generate VMAXNM/VMINNM on ARMv8.
> -  if (Subtarget->hasFPARMv8() && (TrueVal.getValueType() == MVT::f32 ||
> -                                  TrueVal.getValueType() == MVT::f64)) {
> +  // Try to generate VMAXNM/VMINNM on ARMv8. Except if we compare to a zero.
> +  // This ensures we use CMPFPw0 instead of CMPFP in such case.
> +  if (Subtarget->hasFPARMv8() && !isFloatingPointZero(RHS) &&
> +    (TrueVal.getValueType() == MVT::f32 || TrueVal.getValueType() == MVT::f64)) {
>       bool swpCmpOps = false;
>       bool swpVselOps = false;
>       checkVSELConstraints(CC, CondCode, swpCmpOps, swpVselOps);

The comment here makes no sense; this code doesn't even generate VMAXNM.

> @@ -1066,6 +1069,7 @@ def VMOVSR : AVConv4I<0b11100000, 0b1010
>     // pipelines.
>     let D = VFPNeonDomain;
>   }
> +def : Pat<(arm_vmovsr GPR:$Rt), (VMOVSR GPR:$Rt)>;

Does this need to check UseVMOVSR/DontUseVMOVSR?

>   
>   let hasSideEffects = 0 in {
>   def VMOVRRD  : AVConv3I<0b11000101, 0b1011,
>
> Added: llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll?rev=328313&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll (added)
> +++ llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll Fri Mar 23 06:02:03 2018
> @@ -0,0 +1,118 @@
> +; RUN: llc -mtriple=thumbv7m-arm-none-eabi -mattr=+execute-only,+fp-armv8 %s -o - | FileCheck %s
> +
> +; This function used to run into a code selection error on fp-armv8 due to
> +; different ordering of the constant arguments of fcmp. Fixed by extending the
> +; code selection to handle the missing case.
> +define arm_aapcs_vfpcc void @foo0() local_unnamed_addr {
> +  br i1 undef, label %.end, label %1
> +
> +  %2 = fcmp nsz olt float undef, 0.000000e+00
> +  %3 = select i1 %2, float -5.000000e-01, float 5.000000e-01
> +  %4 = fadd nsz float undef, %3
> +  %5 = fptosi float %4 to i32
> +  %6 = ashr i32 %5, 4
> +  %7 = icmp slt i32 %6, 0
> +  br i1 %7, label %8, label %.end

Can you clean up these testcases a bit?  Some of these instructions look 
unnecessary.

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project