[PATCH] D25020: [ARM] Fix 26% performance regression on Cortex-A9 caused by not using VMLA/VMLS

Wed Oct 12 05:51:29 PDT 2016

eastig added inline comments.

================
Comment at: lib/Target/ARM/ARMISelDAGToDAG.cpp:443
+    break;
+  }
+
----------------
eastig wrote:
> rovka wrote:
> > This is now checking only that the node can be lowered to VMLx. What happened to the part checking if forwarding can be used (i.e. mac following multiply or mac) and all the other checks?
> I removed the FMA checks for the following reasons:
> # FMA is lowered either to VFMA or, if the target does not support VFPv4, to a library call.
> # I have not found any information about accumulator forwarding for VFMA.
> 
> I removed other checks because I could not write tests for them. Are there cases when they are false?
Another point is that ARMDAGToDAGISel::hasNoVMLxHazardUse is only called when instructions are being combined into VMLx:

ARMInstrInfo.td:

```
// An 'fadd' node which checks for single non-hazardous use.
def fadd_mlx : PatFrag<(ops node:$lhs, node:$rhs),(fadd node:$lhs, node:$rhs),[{
  return hasNoVMLxHazardUse(N);
}]>;

// An 'fsub' node which checks for single non-hazardous use.
def fsub_mlx : PatFrag<(ops node:$lhs, node:$rhs),(fsub node:$lhs, node:$rhs),[{
  return hasNoVMLxHazardUse(N);
}]>;
```
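To illustrate the point about when the predicate runs, here is a rough, self-contained C++ sketch. It is a toy model, not the LLVM code: `Node`, `Kind`, `hasNoHazardUseToy` and `selectFAddToy` are hypothetical names, and the rule that the single user must itself be a multiply-accumulate is an assumption made only for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node kinds for this sketch; these are not LLVM opcodes.
enum class Kind { FMul, FAdd, Mac, Other };

struct Node {
  Kind kind;
  const Node *lhs;                  // operands, null for leaves
  const Node *rhs;
  std::vector<const Node *> users;  // nodes that consume this node's result

  explicit Node(Kind k, const Node *l = nullptr, const Node *r = nullptr)
      : kind(k), lhs(l), rhs(r) {}
};

// Toy stand-in for the predicate: exactly one user, and (by assumption in
// this sketch) that user is itself a multiply-accumulate.
bool hasNoHazardUseToy(const Node &n) {
  return n.users.size() == 1 && n.users.front()->kind == Kind::Mac;
}

// Toy selector for an fadd. The predicate is consulted only when the fadd
// has an fmul operand, i.e. only when folding into a MAC is possible at all.
Kind selectFAddToy(const Node &fadd, int &predicateCalls) {
  bool hasMulOperand = (fadd.lhs && fadd.lhs->kind == Kind::FMul) ||
                       (fadd.rhs && fadd.rhs->kind == Kind::FMul);
  if (!hasMulOperand)
    return Kind::FAdd;  // plain add, predicate never runs
  ++predicateCalls;
  return hasNoHazardUseToy(fadd) ? Kind::Mac : Kind::FAdd;
}
```

In this model an fadd without an fmul operand is selected as a plain add and the predicate is never evaluated, which mirrors how the `fadd_mlx` and `fsub_mlx` fragments above only come into play for the VMLx patterns.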

https://reviews.llvm.org/D25020