[PATCH] D26855: New unsafe-fp-math implementation for X86 target
Gerolf Hoflehner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 2 13:49:36 PST 2017
Gerolf added a comment.
>From my perspective the implementation is close and only requires a few minor changes.
The compile-time numbers I've seen so far are meaningless (wide variation, unclear if/how many times your code actually fires etc), but I'm not too concerned about O3 fast-math ct and would give it benefit of the doubt.
I did ask the question about performance benefits twice to no avail and admit I'm still curious. I assume to get these numbers you do set your machines into perf mode rather than using some servers running some random load.
Thanks
Gerolf
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9451
+static bool isDividendExactlyOne(MachineFunction &MF, unsigned DividendReg) {
+ // The dividend could be ExactlyOne value and in this case we should not
----------------
Add comment what the function does before the function
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9455
+ // additional constant for reciprocal division but use the dividend instead.
+ // We're trying to find the dividend definition and if it's a constant
+ // ExactlyOne value we'll use it
----------------
it's -> it is
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9523
+///
+/// vmovss xmm1, dword ptr [rip + .LCPI0_0] # xmm1 = mem[0],zero,zero,zero
+/// vrcpss xmm2, xmm0, xmm0
----------------
input to this function are 7 parameters, the comment only lists 6.
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9560
+ if (Iterations <
+ 0) // undefined means it was not defined explicitly via option
+ Iterations = 1; // or attribute that's why we use the default value
----------------
I keep stumbling over this comment and every time i read it: did you mean to say something like
//Execute at least one iteration.
Iternations = Max(1, Iterations);
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:9563
+
+ // The bullets below (0,2,1,3,4,5,6) mean the indexes inside input Instrs
+ // 0: rcp
----------------
Where is that input sequence documented?
https://reviews.llvm.org/D26855
More information about the llvm-commits
mailing list