[PATCH] D26855: New unsafe-fp-math implementation for X86 target

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 26 10:39:27 PST 2017


RKSimon added a comment.

In https://reviews.llvm.org/D26855#657735, @Gerolf wrote:

> I think the only issue that needs to be addressed is (finally!) sharing perf data. This has been raised at least 3 times. The possible compile-time implication, the speciality of the application (fast-math) etc are well understood.
>
> Gerolf


As I understand it, the idea is that by moving this to the MC, these alternative patterns will only be used if (1) the fast-math code permits it and (2) the target CPU's scheduler model indicates that it's quicker? So what you are asking is that we time the two versions of the code on specific CPUs to check that the correct decision is made in each case?
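
To make the two conditions concrete, here is a rough sketch of the decision as I read it. It is not the patch's code or any LLVM API: the `Latencies` struct, the function names, and the single Newton-Raphson refinement step are assumptions for illustration, and it compares critical-path latency only, ignoring throughput.

```cpp
// Hypothetical cost model; names and fields are illustrative, not LLVM's.
struct Latencies {
  unsigned Div; // cycles for divps
  unsigned Rcp; // cycles for rcpps
  unsigned Mul; // cycles for mulps
  unsigned Sub; // cycles for subps
};

// Dependent chain assumed for a/b via a reciprocal estimate with one
// Newton-Raphson step:
//   x0 = rcpps(b); t = b * x0; t = 2 - t; x1 = x0 * t; r = a * x1
// i.e. rcp -> mul -> sub -> mul -> mul.
constexpr unsigned estimateLatency(const Latencies &L) {
  return L.Rcp + L.Mul + L.Sub + L.Mul + L.Mul;
}

// Use the estimate sequence only if (1) fast-math permits it and
// (2) the modelled latency of the sequence beats the plain divide.
constexpr bool useReciprocalEstimate(bool FastMathAllowsIt,
                                     const Latencies &L) {
  return FastMathAllowsIt && estimateLatency(L) < L.Div;
}
```

Timing both versions on real hardware would then tell us whether the `Div` figure the model feeds into this comparison is actually representative for a given CPU.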

This probably means that the tests should be updated to check against a couple of specific target CPUs as well. We're limited by which x86 scheduler models we have, but I know Jaguar (btver2) should use the rcpps version in all cases, and I expect Haswell should use divps.

A quick look at the SandyBridge scheduler model suggests its latency for FDIV is too low (especially for ymm, as it only has a 128-bit div ALU), so it will select divps when it probably shouldn't.
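
A small sketch of why an understated FDIV latency matters. Everything here is hypothetical: the function names are made up, the cycle counts in the usage note are invented rather than taken from the SandyBridge model, and treating a 256-bit divide on a 128-bit divider as two back-to-back passes is a simplifying assumption.

```cpp
// Assumption: a 256-bit divide on a narrower divider costs one pass per
// DividerBits-wide chunk, so a 128-bit unit doubles the latency for ymm.
constexpr unsigned ymmDivLatency(unsigned XmmDivLatency,
                                 unsigned DividerBits) {
  return XmmDivLatency * (256 / DividerBits);
}

// The plain divide is kept whenever the model says it is no slower than
// the rcpps-based sequence.
constexpr bool selectsDiv(unsigned DivLatency, unsigned EstimateLatency) {
  return DivLatency <= EstimateLatency;
}
```

With invented numbers, an xmm divide of 14 cycles against a 20-cycle estimate sequence gives `selectsDiv(14, 20) == true`. If the model reuses that same 14 for ymm, divps is still selected, whereas `selectsDiv(ymmDivLatency(14, 128), 20)` is false and the rcpps version would win.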


https://reviews.llvm.org/D26855




