[PATCH] D21379: [X86] Heuristic to selectively build Newton-Raphson SQRT estimation
Nikolai Bozhenov via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 15 07:42:05 PDT 2016
n.bozhenov added a comment.
Below are some figures to justify the change.
SQRT latency/throughput data (in cycles) from the Intel Architecture Optimization Manual:
| | IVB | HSW | BDW | SKL |
|------+------+-------+-------+------|
| x32 | 14/7 | 13/7 | 13/4 | 13/3 |
| x128 | 14/7 | 13/7 | 13/7 | 13/3 |
| x256 | | 19/13 | 19/13 | 12/6 |
Experimental Newton-Raphson efficiency for latency-bound code:
| | IVB | HSW | BDW | SKL |
|------+------+------+------+------|
| x32 | -41% | -40% | -21% | -40% |
| x128 | -32% | -32% | -17% | -35% |
Experimental Newton-Raphson efficiency for throughput-bound code:
| | IVB | HSW | BDW | SKL |
|------+------+------+------+------|
| x32 | +18% | +21% | -17% | -40% |
| x128 | +10% | +14% | +28% | -50% |
| x256 | | +68% | +85% | +3% |
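
For reference, the Newton-Raphson scheme being measured can be sketched roughly as follows. This is only an illustration of the general technique, not the code the patch generates: `rough_rsqrt` is a hypothetical stand-in for a low-precision hardware reciprocal-square-root estimate, and `nr_sqrt` is an illustrative name. The input is assumed to be positive and finite.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate.
// It computes 1/sqrt(x) and then clears the low 12 mantissa bits,
// so only a coarse approximation is left for refinement.
inline float rough_rsqrt(float x) {
    float y = 1.0f / std::sqrt(x);
    std::uint32_t bits;
    std::memcpy(&bits, &y, sizeof bits);
    bits &= 0xFFFFF000u;  // keep sign, exponent, and top 11 mantissa bits
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step on f(y) = 1/(y*y) - x:
//   y' = y * (1.5 - 0.5 * x * y * y)
// followed by sqrt(x) = x * (1/sqrt(x)).
// Assumes x is positive and finite (x == 0 would produce NaN here).
inline float nr_sqrt(float x) {
    float y = rough_rsqrt(x);
    y = y * (1.5f - 0.5f * x * y * y);
    return x * y;
}
```

The refinement replaces one long-latency operation with a short estimate plus a dependent chain of multiplies, which is why the trade-off differs between latency-bound and throughput-bound code.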
http://reviews.llvm.org/D21379