[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI
LuoYuanke via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 8 02:01:24 PDT 2022
LuoYuanke added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:616
+ setOperationAction(ISD::FROUNDEVEN, MVT::f16, Promote);
+ setOperationAction(ISD::FP_ROUND, MVT::f16, Expand);
+ setOperationAction(ISD::FP_EXTEND, MVT::f32, Expand);
----------------
Just confused how to expand it. Will the expand fail and finally turns to libcall?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:763
+ if (isTypeLegal(MVT::f16)) {
+ setOperationAction(ISD::FP_EXTEND, MVT::f80, Custom);
+ setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f80, Custom);
----------------
Why f16 emulation affect f80 type? Are we checking isTypeLegal(MVT::f80)?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22100
SDValue Res;
+ if (SrcVT == MVT::f16 && !Subtarget.hasFP16()) {
+ if (IsStrict)
----------------
Not sure if it is better to wrapper it into a readable function (e.g., isSoftF16).
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22448
+ if (SrcVT == MVT::f16)
+ return SDValue();
+
----------------
Why we don't extent to f32 here?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22522
+ if (!isScalarFPTypeInSSEReg(SrcVT) ||
+ (SrcVT == MVT::f16 && !Subtarget.hasFP16()))
return SDValue();
----------------
Why we don't extent to f32 here? Will it be promoted finally?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22765
+ DAG.getIntPtrConstant(0, DL));
+ Res = DAG.getNode(X86ISD::STRICT_CVTPS2PH, DL, {MVT::v8i16, MVT::Other},
+ {Chain, Res, DAG.getTargetConstant(4, DL, MVT::i32)});
----------------
Should MVT::v8i16 be MVT::v8f16?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22766
+ Res = DAG.getNode(X86ISD::STRICT_CVTPS2PH, DL, {MVT::v8i16, MVT::Other},
+ {Chain, Res, DAG.getTargetConstant(4, DL, MVT::i32)});
+ Chain = Res.getValue(1);
----------------
Is it rounding control? Can we use a macro or add comments for what is the rounding control?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22775
+
+ Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, MVT::i16, Res,
+ DAG.getIntPtrConstant(0, DL));
----------------
MVT::f16 and delete the bitcast?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:44211
VT != MVT::f80 && VT != MVT::f128 &&
+ !(VT.getScalarType() == MVT::f16 && !Subtarget.hasFP16()) &&
(TLI.isTypeLegal(VT) || VT == MVT::v2f32) &&
----------------
Not sure if it is better to wrapper it into a readable function (e.g., isSoftF16).
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:1476
}
-let Predicates = [HasFP16] in {
+let Predicates = [HasBWI] in {
def : Pat<(v32f16 (X86VBroadcastld16 addr:$src)),
----------------
If target don't have avx512bw feature. There is some other pattern to lower the node or fp16 broadcast node is invalid?
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:4107
_.ExeDomain>, EVEX_4V, Sched<[SchedWriteFShuffle.XMM]>;
+ let Predicates = [prd] in {
def rrkz : AVX512PI<0x10, MRMSrcReg, (outs _.RC:$dst),
----------------
Previous prd only apply to "def rr"? Is it a bug for previous code?
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:4352
+defm : avx512_move_scalar_lowering<"VMOVSHZ", X86Movsh, fp16imm0, v8f16x_info>;
+defm : avx512_store_scalar_lowering<"VMOVSHZ", avx512vl_f16_info,
+ (v32i1 (bitconvert (and GR32:$mask, (i32 1)))), GR32>;
----------------
Why previous code don't have predicates?
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:11657
+let Predicates = [HasBWI], AddedComplexity = -10 in {
+ def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (VPINSRWZrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16X)>;
----------------
Why set AddedComplexity to -10? There no such addtional complexity in previous code. Add comments for it?
================
Comment at: llvm/lib/Target/X86/X86InstrSSE.td:3970
+let Predicates = [UseSSE2], AddedComplexity = -10 in {
+ def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (PINSRWrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16)>;
----------------
Why AddedComplexity = -10? Add comments for it?
================
Comment at: llvm/lib/Target/X86/X86InstrSSE.td:3978
+let Predicates = [HasAVX, NoBWI], AddedComplexity = -10 in {
+ def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (VPINSRWrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16)>;
+ def : Pat<(i16 (bitconvert f16:$src)), (EXTRACT_SUBREG (VPEXTRWrr (v8i16 (COPY_TO_REGCLASS FR16:$src, VR128)), 0), sub_16bit)>;
----------------
Miss pattern for store?
================
Comment at: llvm/lib/Target/X86/X86InstrSSE.td:5214
+
+let Predicates = [HasAVX, NoBWI] in
+ def : Pat<(store f16:$src, addr:$dst), (VPEXTRWmr addr:$dst, (v8i16 (COPY_TO_REGCLASS FR16:$src, VR128)), 0)>;
----------------
Why no AddedComplexity for it?
================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.td:540
+def FR16 : RegisterClass<"X86", [f16], 16, (add FR32)> {let Size = 32;}
+
----------------
pengfei wrote:
> skan wrote:
> > The alignment is not same as the size?
> No. This is spill size instead of alignment.
When there is avx512fp16 feature, is the spill size still 32?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107082/new/
https://reviews.llvm.org/D107082
More information about the llvm-commits
mailing list