[PATCH] D127158: [AArch64] Add intrinsic support for gpr<->fpr flavors of fixed-point converts

Wed Jun 8 14:09:51 PDT 2022

dmgreen added a comment.

> @dmgreen @fpetrogalli can you confirm whether the spec is right?

Thanks for the info. I've asked internally if anyone has a clear memory. It does look like GCC produces the x/w variants: https://godbolt.org/z/1GqeGv9xn

> The main goal is just to expose these instructions to anyone who might want them, since they expose functionality that isn't available with the existing intrinsic support.

OK I see. My main reason for asking was whether the `fp_to_si(fmul(x, C))` form would be acceptable for your use case. Compared to the @llvm.aarch64.neon... intrinsics, the plain IR instructions, whilst not always perfectly identical, do have certain benefits. It depends what the user is after, but the plain IR instructions benefit from all the constant folding, range analysis, vectorization, etc. that can happen in the mid-end, whereas the neon intrinsics often remain black boxes to optimizations. The intrinsics would probably be more accurately specified as `fptosi_sat(fmul(x, C))`, so long as the constants were precise, but I'm not sure if there is lowering for that yet.
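To make the `fptosi_sat(fmul(x, C))` reading concrete, here is a rough Python model of it, with C = 2^fbits. This is only a sketch of the intended semantics, not code from the patch: the helper names `fptosi_sat` and `fixed_point_convert` are made up for illustration, and it assumes truncation toward zero, saturation at the integer bounds, and NaN mapping to 0.

```python
import math


def fptosi_sat(x: float, bits: int = 32) -> int:
    """Illustrative model of a saturating float -> signed int conversion.

    Assumptions: round toward zero, clamp to the signed range, NaN -> 0.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x >= hi:  # also catches +inf
        return hi
    if x <= lo:  # also catches -inf
        return lo
    return math.trunc(x)


def fixed_point_convert(x: float, fbits: int, bits: int = 32) -> int:
    """Model a fixed-point convert as fptosi_sat(fmul(x, 2**fbits)).

    Scaling by a power of two is exact for Python floats (doubles) unless it
    overflows to infinity, in which case the result saturates anyway.
    """
    return fptosi_sat(x * float(1 << fbits), bits)


if __name__ == "__main__":
    print(fixed_point_convert(1.5, 16))    # 1.5 in Q16 fixed point
    print(fixed_point_convert(1e30, 16))   # saturates at the int32 maximum
```

The plain `fp_to_si` form differs only in the out-of-range and NaN cases, which is why the saturating form looks like the closer match.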

If the `fptosi_sat(fmul(x, C))` form is precisely equivalent to the intrinsics, my opinion would be to remove the @llvm.aarch64.neon.vcvt.. intrinsics entirely and rely on pure codegen. You always run into the possibility that the compiler may produce a worse result by mis-optimizing, but the chances of improvement usually outweigh the downsides. (I'm not sure it works in all cases though, for example if 2^16 can't be represented as an fp16.)
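The fp16 concern can be checked quickly. The sketch below uses Python's `struct` module, whose `e` format is IEEE 754 binary16; the helper name `fits_in_fp16` is hypothetical and only serves the illustration.

```python
import struct


def fits_in_fp16(value: float) -> bool:
    """Return True if `value` round-trips exactly through IEEE 754 binary16."""
    try:
        packed = struct.pack("<e", value)
    except OverflowError:
        # struct refuses values too large for the half-precision format.
        return False
    return struct.unpack("<e", packed)[0] == value


if __name__ == "__main__":
    for exponent in (15, 16):
        print(exponent, fits_in_fp16(2.0 ** exponent))
```

The largest finite fp16 value is 65504, so 2^15 is representable but 2^16 is not, and the multiplier constant for a 16-bit fractional shift could not be written as an fp16 immediate in the `fmul` form.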

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127158/new/

https://reviews.llvm.org/D127158