[PATCH] D26855: New unsafe-fp-math implementation for X86 target

Gerolf Hoflehner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 21 14:36:32 PST 2016


Gerolf added a comment.

A few general remarks:

- I'm very much in favor of the MC combiner approach
- But I'm getting increasingly concerned about the MC code quality. I felt the FMA/Aarch64 support starts looking crabby (I'm the culprit) and this doesn't look much  better I'm afraid. The approach is in need of more automation for better maintainability. This is meant as food for thought. Since I"m too blame for the approach I can't to be harsh in reviews :-)
- I'm also concerned about the compile-time (in particular since we don't track x86 specific issues ( or any other backend than Aarch64- or least I'm not aware that anyone is watching closely). Could you share some specific data about your perf gains and compile-time measurements? However, I think this optimization is for fast math only and compile-time is probably less of an issue in that mode. One way to deal with compile-time issues is to wrap some MC under an option.
- Perhaps I missed it but I expected the optimization to kick in only under fast math. I saw 'fast' in the test cases, but didn't see a check in the code.

Thanks
Gerolf



================
Comment at: include/llvm/CodeGen/MachineInstr.h:1154
+  // those instructions because disassembler needs access to such things like
+  // Target Instruction Info. To support the access we included P argument below
+  void print(raw_ostream &OS, bool SkipOpers = false,
----------------
MBB instead of P? Is this for debugging? 


================
Comment at: lib/CodeGen/MachineCombiner.cpp:158
+        int UseIdx = InstrPtr->findRegisterUseOperandIdx(MO.getReg());
+        if (DefIdx < 0 || UseIdx < 0)
+          LatencyOp = TII->defaultDefLatency(SchedModel, *DefInstr);
----------------
Please add a comment that explains why default is used.

+ assert(DefIdx || UseIdx);


================
Comment at: lib/CodeGen/MachineInstr.cpp:1712
   const TargetIntrinsicInfo *IntrinsicInfo = nullptr;
+  const MachineBasicBlock *MBB = P ? P : getParent();
 
----------------
you can use the new parameter and then if (!MBB) MBB=getParent()


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9426
+  // ExactlyOne value we'll use it
+  for (auto &MBB : MF) {
+    for (auto &MI : MBB) {
----------------
This is an expensive search. There must be a direct simpler way to get that info.


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9477
+
+/// genFDivReciprocal - Generates A = B * 1/C instead of A = B/C
+///
----------------
For which types? All FP?


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9482
+/// X[0] = reciprocal (C);
+/// X[i+1] = X[i] + X[i] * (1 - C * X[i]); every iteration increases precision
+///
----------------
Can you elaborate? eg. on #iterations?


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9529
+  if (Iterations < 0)
+    Iterations = 1; // undefined value of iterations should produce 1 iteration
+
----------------
Why?


https://reviews.llvm.org/D26855





More information about the llvm-commits mailing list