[llvm] r232954 - [ARM] Add more pattern matching for f16 <-> f64 conversions

Ahmed Bougacha ahmed.bougacha at gmail.com
Fri Mar 27 09:55:22 PDT 2015


On Mon, Mar 23, 2015 at 9:49 AM, Bradley Smith <bradley.smith at arm.com> wrote:
> You're right, I hadn't considered this. I agree it makes more sense to fix the whole thing in Clang, so I'll revert my change and let you fix Clang's codegen instead. Thanks for pointing this out!

The clang part finally went in a couple days ago, as r232968.  Let me
know if there's more needed for fp16!
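
For anyone skimming the thread, here is a minimal Python sketch of the
double-rounding problem discussed in the quoted text below.  It is only an
illustration and is not taken from either commit: the helper names and the
input value are hypothetical, and it assumes that Python's struct 'f' and
'e' formats round to nearest, ties to even, as on a typical IEEE 754 host.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32, returned as a float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16_bits(x: float) -> int:
    """Round a Python float to the nearest f16 and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

# Hypothetical input: just above the midpoint between the adjacent f16
# values 1.0 (0x3c00) and 1.0 + 2**-10 (0x3c01).
x = 1.0 + 2.0**-11 + 2.0**-30

# Single rounding, f64 -> f16 (what the thread says VCVTB.F16.F64 does).
single = to_f16_bits(x)

# Double rounding, f64 -> f32 -> f16 (what the matched IR expresses).
# The f32 step drops the 2**-30 term, leaving an exact tie for the f16 step.
double_rounded = to_f16_bits(to_f32(x))

print(hex(single), hex(double_rounded))
```

With this input the direct conversion gives 0x3c01, while the two-step
conversion gives 0x3c00, because the intermediate f32 value lands exactly
on a tie that then rounds to even.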

Thanks,
-Ahmed

> Regards,
> Bradley Smith
>
>> -----Original Message-----
>> From: Ahmed Bougacha [mailto:ahmed.bougacha at gmail.com]
>> Sent: 23 March 2015 16:11
>> To: Bradley Smith
>> Cc: LLVM Commits
>> Subject: Re: [llvm] r232954 - [ARM] Add more pattern matching for f16 <-> f64 conversions
>>
>> These aren't equivalent, are they?  AFAICT, VCVTB.F16.F64 (F8.1.65
>> ARMARMv8 0487A.e) does single-rounding, but the matched IR does
>> double-rounding, which yields different results in some cases.
>>
>> The double-rounding generated by clang is the root problem, and is
>> supposed to be fixed with D4602/D8367.  I fixed a DAGCombine that does
>> pretty much this (r228911) some time ago, while waiting for the clang
>> change to be approved.
>>
>> The F16->F64 part is fine, but for the sake of consistency, and given
>> the F64->F16 part is wrong, I'd rather we just fix the clang CodeGen
>> and revert this.  How does that sound?
>>
>> Thanks!
>> -Ahmed
>>
>>
>> On Mon, Mar 23, 2015 at 8:59 AM, Bradley Smith <bradley.smith at arm.com> wrote:
>> > Author: brasmi01
>> > Date: Mon Mar 23 10:59:54 2015
>> > New Revision: 232954
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=232954&view=rev
>> > Log:
>> > [ARM] Add more pattern matching for f16 <-> f64 conversions
>> >
>> > Specifically when the conversion is done in two steps, f16 -> f32 -> f64.
>> >
>> > For example:
>> >
>> > %1 = tail call float @llvm.convert.from.fp16.f32(i16 %0)
>> > %conv = fpext float %1 to double
>> >
>> > to:
>> >
>> > vcvtb.f64.f16
>> >
>> > Added:
>> >     llvm/trunk/test/CodeGen/ARM/fp16-64.ll
>> > Modified:
>> >     llvm/trunk/lib/Target/ARM/ARMInstrVFP.td
>> >
>> > Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td
>> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrVFP.td?rev=232954&r1=232953&r2=232954&view=diff
>> >
>> > ==============================================================================
>> > --- llvm/trunk/lib/Target/ARM/ARMInstrVFP.td (original)
>> > +++ llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Mon Mar 23 10:59:54 2015
>> > @@ -627,6 +627,14 @@ def : Pat<(f16_to_fp GPR:$a),
>> >  def : Pat<(f64 (f16_to_fp GPR:$a)),
>> >            (VCVTBHD (COPY_TO_REGCLASS GPR:$a, SPR))>;
>> >
>> > +def : Pat<(f64 (fextend (f16_to_fp GPR:$a))),
>> > +          (VCVTBHD (COPY_TO_REGCLASS GPR:$a, SPR))>,
>> > +          Requires<[HasFPARMv8, HasDPVFP]>;
>> > +
>> > +def : Pat<(fp_to_f16 (fround (f64 DPR:$a))),
>> > +          (i32 (COPY_TO_REGCLASS (VCVTBDH DPR:$a), GPR))>,
>> > +          Requires<[HasFPARMv8, HasDPVFP]>;
>> > +
>> >  multiclass vcvt_inst<string opc, bits<2> rm,
>> >                       SDPatternOperator node = null_frag> {
>> >    let PostEncoderMethod = "", DecoderNamespace = "VFPV8" in {
>> >
>> > Added: llvm/trunk/test/CodeGen/ARM/fp16-64.ll
>> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fp16-64.ll?rev=232954&view=auto
>> >
>> > ==============================================================================
>> > --- llvm/trunk/test/CodeGen/ARM/fp16-64.ll (added)
>> > +++ llvm/trunk/test/CodeGen/ARM/fp16-64.ll Mon Mar 23 10:59:54 2015
>> > @@ -0,0 +1,31 @@
>> > +; RUN: llc -mtriple=arm -mattr=+fp-armv8 < %s | \
>> > +; RUN:   FileCheck --check-prefix=CHECK --check-prefix=V8 %s
>> > +; RUN: llc -mtriple=arm -mattr=+vfp3,+d16 < %s | \
>> > +; RUN:   FileCheck --check-prefix=CHECK --check-prefix=NOV8 %s
>> > +
>> > +declare float @llvm.convert.from.fp16.f32(i16) nounwind readnone
>> > +declare i16 @llvm.convert.to.fp16.f32(float) nounwind readnone
>> > +
>> > +define void @vcvt_f64_f16(i16* %x, double* %y) nounwind {
>> > +entry:
>> > +; CHECK-LABEL: vcvt_f64_f16
>> > +  %0 = load i16, i16* %x, align 2
>> > +  %1 = tail call float @llvm.convert.from.fp16.f32(i16 %0)
>> > +  %conv = fpext float %1 to double
>> > +; CHECK-V8: vcvtb.f64.f16
>> > +; CHECK-NOV8-NOT: vcvtb.f64.f16
>> > +  store double %conv, double* %y, align 8
>> > +  ret void
>> > +}
>> > +
>> > +define void @vcvt_f16_f64(i16* %x, double* %y) nounwind {
>> > +entry:
>> > +; CHECK-LABEL: vcvt_f16_f64
>> > +  %0 = load double, double* %y, align 8
>> > +  %conv = fptrunc double %0 to float
>> > +; CHECK-V8: vcvtb.f16.f64
>> > +; CHECK-NOV8-NOT: vcvtb.f16.f64
>> > +  %1 = tail call i16 @llvm.convert.to.fp16.f32(float %conv)
>> > +  store i16 %1, i16* %x, align 2
>> > +  ret void
>> > +}
>> >
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


