[PATCH] D42580: [ARM] Armv8.2-A FP16 code generation (part 2/3)

Mon Jan 29 08:37:39 PST 2018

SjoerdMeijer added a comment.

> It might be intended to only apply to function arguments and returns, but those patterns for f16_to_fp and fp_to_f16 could match anywhere.

Just checking to see if I understand this correctly.

I am matching this:

      t2: i32,ch = CopyFromReg t0, Register:i32 %0
    t7: i16 = truncate t2
  t8: f16 = bitcast t7

and, with the custom lowering in this patch that uses the fp16_to_fp node,
I am generating this:

    t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t18: f32 = fp16_to_fp t2

and using this rewrite pattern:

  def : Pat<(f16_to_fp GPR:$a), 
            (f32 (COPY_TO_REGCLASS GPR:$a, HPR))>;

results in moves from int to fp registers:

  vmov  s0, r1
  vmov  s2, r0
  vadd.f16  s0, s2, s0
  ...

That's what I meant by the comment:

  // We use FP16_TO_FP just to model a GPR -> HPR move

I got the inspiration for this approach from, e.g., the existing test case:

  test/CodeGen/ARM/fp16-args.ll

which generates exactly the same DAG for its incoming half arguments:

    t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t18: f32 = fp16_to_fp t2

Thus, I am (re)using the same approach, except that I am not doing the convert when
FullFP16 is enabled. Is your concern that I am changing the semantics of these nodes because
I am omitting this convert? The "definition" of these nodes reads:

  /// FP16_TO_FP, FP_TO_FP16 - These operators are used to perform promotions
  /// and truncation for half-precision (16 bit) floating numbers. These nodes
  /// form a semi-softened interface for dealing with f16 (as an i16), which
  /// is often a storage-only type but has native conversions.

I liked the "semi-softened interface" part here, because that's how I am using
it in a new context; I was/am reluctant to introduce a new node here.
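
To make the semantic question concrete, here is a minimal standalone C++ sketch (not LLVM code; the helper names convertHalfBitsToFloat, moveHalfBitsUnchanged and floatBits are hypothetical and exist only for this illustration). It contrasts a value-converting promotion of IEEE half-precision bits to f32, which is what the node definition quoted above describes, with a plain bit-preserving register move, which is what the pattern effectively models when the convert is omitted:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: value-converting promotion. Decodes IEEE binary16
// bits (held in an integer) and returns the same numeric value as a float.
float convertHalfBitsToFloat(uint16_t h) {
    uint32_t sign = (h >> 15) & 0x1;
    uint32_t exp  = (h >> 10) & 0x1F;
    uint32_t mant = h & 0x3FF;
    float result;
    if (exp == 0) {
        // Zero or subnormal: mant * 2^-24.
        result = std::ldexp(static_cast<float>(mant), -24);
    } else if (exp == 31) {
        // Infinity or NaN.
        result = mant ? std::numeric_limits<float>::quiet_NaN()
                      : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + mant) * 2^(exp - 25).
        result = std::ldexp(static_cast<float>(1024 + mant),
                            static_cast<int>(exp) - 25);
    }
    return sign ? -result : result;
}

// Hypothetical helper: bit-preserving move. The 16 payload bits are kept
// unchanged; no numeric conversion takes place.
uint32_t moveHalfBitsUnchanged(uint32_t gpr) {
    return gpr & 0xFFFF;
}

// Returns the raw bit pattern of a float, for comparison purposes.
uint32_t floatBits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For the half value 1.0 (bits 0x3C00), the converting version yields an f32 whose bits are 0x3F800000, while the move leaves 0x3C00 in place. The two interpretations therefore differ in the register contents they produce, which I assume is the difference the concern is about.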

https://reviews.llvm.org/D42580