[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6
LuoYuanke via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 26 07:27:57 PDT 2021
LuoYuanke added inline comments.
================
Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4223
+
+// CFC ADD PH
+
----------------
MADD?
================
Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4315
+
+// CF ADD PH
+
----------------
MADD?
================
Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5732
+
+ def int_x86_avx512fp16_mask_vfcmaddc_ph_128
+ : GCCBuiltin<"__builtin_ia32_vfcmaddcph128_mask">,
----------------
_cph?
================
Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5796
+ [ IntrNoMem, ImmArg<ArgIndex<4>> ]>;
+ def int_x86_avx512fp16_mask_vfmaddc_sh
+ : GCCBuiltin<"__builtin_ia32_vfmaddcsh_mask">,
----------------
_csh?
================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3902
+ case X86::VFCMADDCSHZr:
+ case X86::VFCMADDCSHZrb:
+ case X86::VFCMADDCSHZrbk:
----------------
"b" means rounding. Right?
================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3948
+ for (unsigned i = 2; i < Inst.getNumOperands(); i++)
+ if (Inst.getOperand(i).isReg() && Dest == Inst.getOperand(i).getReg())
+ return Warning(Ops[0]->getStartLoc(), "Destination register should be "
----------------
Sorry, I didn't find the constrain in the spec.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47289
+ return 0;
+ if (RHS->getOpcode() == ISD::BITCAST && RHS.hasOneUse() &&
+ (RHS->getOperand(0)->getOpcode() == X86ISD::VFMULC ||
----------------
Can swap LHS and RHS reduce some redundant code?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47296
+ };
+ int MulId = getMulId();
+ const TargetOptions &Options = DAG.getTarget().Options;
----------------
The lambda seems only be called once.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47298
+ const TargetOptions &Options = DAG.getTarget().Options;
+ if ((Options.AllowFPOpFusion == FPOpFusion::Fast || Options.UnsafeFPMath) &&
+ MulId < 2 && Subtarget.hasFP16() && IsAdd &&
----------------
Is it possible fast and non-fast instruction is mixed due to inline? Shall we check the instruction AllowContract flag?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47359
+// t23: v16f32 = X86ISD::VFCMULC[X86ISD::VFMULC]
+// t8, t22
+// t24: v32f16 = bitcast t23
----------------
Merge it to previous line.
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:5781
+ (MaskOpNode _.RC:$src1, (_.VT (_.BroadcastLdFrag addr:$src2))),
+ 0, 0, 0, ClobberConstraint>,
EVEX_4V, EVEX_B,
----------------
Moving ClobberConstraint before IsCommutable saves the code for default value?
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:13604
+
+ defm VFMULCPH : avx512_cfmbinop_common<0xD6, "vfmulcph", x86vfmulc, x86vfmulc,
+ x86vfmulcRnd>, T_MAP6XS, EVEX_CD8<32, CD8VF>;
----------------
The name seems not accurate. Is it cfmop for mul and cfmaop for fma?
================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:13640
+ (v4f32 (OpNode VR128X:$src1, VR128X:$src2)),
+ 0, 0, 0, X86selects, "@earlyclobber $dst">, Sched<[sched.XMM]>;
+ defm rm : AVX512_maskable<opc, MRMSrcMem, f32x_info, (outs VR128X:$dst),
----------------
I didn't see this flag for other scalar instructions, why we need it for complex instruction?
================
Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:1852
+ { X86::VFCMULCPHZrr, X86::VFCMULCPHZrm, 0 },
+ { X86::VFCMULCSHZrr, X86::VFCMULCSHZrm, TB_NO_REVERSE },
{ X86::VFMADDPD4Yrr, X86::VFMADDPD4Ymr, 0 },
----------------
Why FR32X version is not needed for complex scalar instructions?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105269/new/
https://reviews.llvm.org/D105269
More information about the llvm-commits
mailing list