[llvm-dev] [RFC] Half-Precision Support in the Arm Backends

Mon Dec 4 12:20:11 PST 2017

On 12/4/2017 6:44 AM, Sjoerd Meijer via llvm-dev wrote:
>
> Custom Lowering
> -------------------------
>
> Making f16 legal without native load/store instructions available
> (no FullFP16 support) means custom lowering loads/stores:
> 1) Since we don't have FP16 load/store instructions available, we create
>    integer half-word loads. I unfortunately need the FP16_TO_FP node here,
>    because that "models" creating an integer value, which is what we need
>    to create a "truncating i16" integer load instruction. Instead of using
>    FP16_TO_FP, I have tried BITCASTs, but this can lead to code generation
>    of stack loads/stores, which I don't want.
> 2) Custom lowering f16 stores is very similar, and creates truncating
>    half-word integer stores.

Technically, there are no f16 load/store instructions, yes, but we can
use NEON vld1 and vst1 to get something roughly equivalent, right?

You probably want to custom-lower BITCAST instructions; the generic 
sequence emitted by the legalizer is pretty inefficient in most cases.
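
For readers following along, here is a rough C++ model of what the quoted
load/store lowering computes: the f16 value moves through memory as a
plain 16-bit integer, and an FP16_TO_FP-style step reinterprets those bits
as a floating-point value. This is only a sketch of the semantics, not
backend code; the helper names (load_i16, store_i16, fp16_bits_to_float)
are made up for illustration and assume IEEE 754 binary16 and binary32.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: models the integer half-word load.
static uint16_t load_i16(const void *p) {
    uint16_t bits;
    std::memcpy(&bits, p, sizeof bits);
    return bits;
}

// Hypothetical helper: models the truncating half-word integer store.
static void store_i16(void *p, uint16_t bits) {
    std::memcpy(p, &bits, sizeof bits);
}

// Hypothetical helper: models an FP16_TO_FP-style conversion, taking the
// half-precision payload as an integer and producing a float.
static float fp16_bits_to_float(uint16_t h) {
    uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t out;
    if (exp == 0) {
        if (man == 0) {
            out = sign;  // signed zero
        } else {
            // Subnormal half: shift the mantissa up until the implicit
            // bit appears, adjusting the exponent for each shift.
            int shift = 0;
            while ((man & 0x400u) == 0) {
                man <<= 1;
                ++shift;
            }
            man &= 0x3FFu;
            out = sign | (static_cast<uint32_t>(113 - shift) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        out = sign | 0x7F800000u | (man << 13);  // infinity or NaN
    } else {
        out = sign | ((exp + 112u) << 23) | (man << 13);  // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &out, sizeof f);
    return f;
}
```

The point of the model is that nothing floating-point happens at the
memory access itself: the load and store only ever see an i16, and the
conversion is a separate step.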

---

Overall, I think your approach makes sense.

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
