<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-serif, 'EmojiFont', 'Apple Color Emoji', 'Segoe UI Emoji', NotoColorEmoji, 'Segoe UI Symbol', 'Android Emoji', EmojiSymbols;" dir="ltr">
<p style="margin-top:0;margin-bottom:0">Thanks a lot for the suggestions! I will look into using vld1/vst1; that sounds good.</p>
<p style="margin-top:0;margin-bottom:0">I am custom lowering the bitcasts, and that is now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created, to avoid inefficient code generation. I will double-check whether I can achieve the same without using these nodes, because I would really like to get rid of them completely.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Cheers,</p>
<p style="margin-top:0;margin-bottom:0">Sjoerd.<br>
</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<pre>>On 12/4/2017 6:44 AM, Sjoerd Meijer via llvm-dev wrote:
>><i>
</i>>><i> Custom Lowering
</i>>><i> -------------------------
</i>>><i>
</i>>><i> Making f16 legal without native load/store instructions available
</i>>><i> (no FullFP16 support) means custom lowering loads/stores:
</i>>><i> 1) Since we don't have FP16 load/store instructions available, we create
</i>>><i> integer half-word loads. I unfortunately need the FP16_TO_FP node here,
</i>>><i> because that "models" creating an integer value, which is what we need
</i>>><i> to create a "truncating i16" integer load instruction. Instead of using
</i>>><i> FP16_TO_FP, I have tried BITCASTs, but this can lead to code generation
</i>>><i> of stack loads/stores, which I don't want.
</i>>><i> 2) Custom lowering f16 stores is very similar, and creates truncating
</i>>><i> half-word integer stores.
</i>>
>Technically, there are no f16 load/store instructions, yes, but we can
>use NEON vld1 and vst1 to get something roughly equivalent, right?
>
>You probably want to custom-lower BITCAST instructions; the generic
>sequence emitted by the legalizer is pretty inefficient in most cases.
>
>---
>
>Overall, I think your approach makes sense.
</pre>
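<p style="margin-top:0;margin-bottom:0">To make point 1 of the quoted text concrete, here is a rough sketch in plain C++ of what "integer half-word load, then FP16_TO_FP" computes. It is only an illustration under assumptions, not LLVM code: the helper names load_half_word, pow2 and half_bits_to_float are hypothetical, a little-endian layout is assumed, and the conversion is done arithmetically where real hardware would use a conversion instruction.</p>
<pre>
```cpp
// Hypothetical sketch, not LLVM code. No headers are required.

// Step 1: the "truncating i16" integer load. Two bytes are read as an
// integer (little-endian layout assumed); no FP register is involved.
unsigned load_half_word(const unsigned char *p) {
    return p[0] | (p[1] << 8);
}

// Helper: 2 raised to the power e, built by repeated exact scaling.
float pow2(int e) {
    float r = 1.0f;
    for (; e > 0; --e) r *= 2.0f;
    for (; e < 0; ++e) r *= 0.5f;
    return r;
}

// Step 2: what FP16_TO_FP models. The integer holds IEEE 754 binary16
// bits (1 sign, 5 exponent, 10 mantissa) and is widened to float.
float half_bits_to_float(unsigned h) {
    unsigned sign = (h >> 15) & 1u;
    unsigned exponent = (h >> 10) & 0x1Fu;
    unsigned mantissa = h & 0x3FFu;
    float value;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        value = float(mantissa) * pow2(-24);
    } else if (exponent == 31) {
        // Infinity or NaN; pow2(128) overflows to +infinity under IEEE 754.
        float inf = pow2(128);
        value = (mantissa == 0) ? inf : inf - inf;
    } else {
        // Normal: (1024 + mantissa) * 2^(exponent - 25).
        value = float(1024u + mantissa) * pow2(int(exponent) - 25);
    }
    return sign ? -value : value;
}
```
</pre>
<p style="margin-top:0;margin-bottom:0">For example, the two bytes 0x00 0x3C load as the integer 0x3C00, which converts to 1.0. The value is handled purely as an integer until the conversion step, which is the property the quoted lowering relies on.</p>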
</div>
</body>
</html>