[PATCH] D21379: [X86] Heuristic to selectively build Newton-Raphson SQRT estimation

Nikolai Bozhenov via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 15 07:42:05 PDT 2016


n.bozhenov added a comment.

Below are some figures to justify the change.

Latency/throughput data (in cycles) for the hardware SQRT instruction, from the Intel Architecture Optimization Manual:

  |      |  IVB |   HSW |   BDW |  SKL |
  |------+------+-------+-------+------|
  | x32  | 14/7 |  13/7 |  13/4 | 13/3 |
  | x128 | 14/7 |  13/7 |  13/7 | 13/3 |
  | x256 |      | 19/13 | 19/13 | 12/6 |
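
As a rough way to read the table above, the two numbers in each cell can be plugged into a very simple cost model. This is only a sketch: it assumes the second number is reciprocal throughput (cycles between independent instructions), and the names SqrtCost, latencyBoundCycles and throughputBoundCycles are hypothetical, not part of LLVM or the patch.

```cpp
// Hypothetical cost record for one cell of the table (not an LLVM type).
struct SqrtCost {
    int latency;          // cycles until the result is available
    int recipThroughput;  // assumed: cycles between independent sqrts
};

// Latency-bound: every sqrt depends on the previous result, so the
// operations cannot overlap and each one pays the full latency.
constexpr long latencyBoundCycles(SqrtCost c, long n) {
    return n * c.latency;
}

// Throughput-bound: the sqrts are independent, so the limit is how
// often the execution unit can accept a new one.
constexpr long throughputBoundCycles(SqrtCost c, long n) {
    return n * c.recipThroughput;
}
```

With the HSW x32 figures (13/7), latencyBoundCycles(SqrtCost{13, 7}, 100) gives 1300 and throughputBoundCycles(SqrtCost{13, 7}, 100) gives 700. Real code sits somewhere between the two extremes, which is why the measurements below are split into latency-bound and throughput-bound cases.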

Experimental Newton-Raphson efficiency for latency-bound code:

  |      |  IVB |  HSW |  BDW |  SKL |
  |------+------+------+------+------|
  | x32  | -41% | -40% | -21% | -40% |
  | x128 | -32% | -32% | -17% | -35% |

Experimental Newton-Raphson efficiency for throughput-bound code:

  |      |  IVB |  HSW |  BDW |  SKL |
  |------+------+------+------+------|
  | x32  | +18% | +21% | -17% | -40% |
  | x128 | +10% | +14% | +28% | -50% |
  | x256 |      | +68% | +85% |  +3% |
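
For readers unfamiliar with the technique being measured, here is a minimal sketch of a Newton-Raphson square root. It is not the code generated by the patch: the hardware reciprocal square root estimate (RSQRTSS/RSQRTPS) is replaced by a portable stand-in that keeps roughly 12 bits of precision, and all function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Stand-in for the hardware estimate: computes 1/sqrt(x) and then
// discards the low 11 of the 23 mantissa bits, leaving a coarse value.
inline float coarseRsqrt(float x) {
    float y = 1.0f / std::sqrt(x);
    std::uint32_t bits;
    std::memcpy(&bits, &y, sizeof bits);
    bits &= 0xFFFFF800u;  // clear mantissa bits 0..10
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step for f(y) = 1/y^2 - x:
//   y1 = y0 * (1.5 - 0.5 * x * y0^2)
// Each step roughly doubles the number of correct bits.
inline float refineRsqrt(float x, float y0) {
    return y0 * (1.5f - 0.5f * x * y0 * y0);
}

// sqrt(x) = x * (1/sqrt(x)). Only meant for finite x > 0 in this sketch;
// x == 0 would produce NaN (0 * inf) and needs separate handling.
inline float sqrtViaNewtonRaphson(float x) {
    return x * refineRsqrt(x, coarseRsqrt(x));
}
```

The refinement replaces one long-latency SQRT with an estimate plus several dependent multiplies and a subtract. That dependency chain is one plausible reason the latency-bound figures differ so much from the throughput-bound ones.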


http://reviews.llvm.org/D21379




