[PATCH] D15294: [x86] inline calls to fmaxf / llvm.maxnum.f32 using maxss (PR24475)

Mon Dec 14 15:44:44 PST 2015

spatel added a comment.

In http://reviews.llvm.org/D15294#310167, @zansari wrote:

> Quick high-level question: wouldn't it be better to pull the intermediate value out of the fmax to reduce the dependence chain?

Yes, that would be better.

Because I'm SSE dyslexic, I altered the test program in https://llvm.org/bugs/show_bug.cgi?id=24475 to check, and this is what I came up with:

  __m128 maxnum = _mm_max_ss(v2, v1);
  __m128 isnan1 = _mm_cmpunord_ss(v1, v1);
  maxnum = _mm_blendv_ps(maxnum, v2, isnan1);

This compiles to (AT&T syntax, which should invert the dyslexia, but I still can't get it right):

  vmaxss        %xmm0, %xmm1, %xmm2         <--- if either input is NaN, xmm0 (v1) is returned
  vcmpunordss   %xmm0, %xmm0, %xmm0
  vblendvps     %xmm0, %xmm1, %xmm2, %xmm0  <--- if xmm0 (v1) is NaN, output xmm1 (v2); if not, output the vmaxss result (the max, or v1 if v2 is NaN)
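
For anyone else who wants to sanity-check the NaN cases without reading the operand order, here is a rough scalar model of the low element of that sequence in plain C. It is only a sketch: the function names (maxss_model, maxnum_sketch, check_maxnum_sketch) are made up for illustration and are not part of the patch, and it assumes maxss behaves as "a > b ? a : b", so the second operand comes back whenever either input is NaN. It also assumes the compiler is not running in a fast-math mode.

```c
#include <assert.h>
#include <math.h>   /* NAN macro only; no libm functions are called */

/* Hypothetical scalar model of _mm_max_ss(a, b) for the low element:
   returns a only when a > b; otherwise (including when either input
   is NaN) it returns the second operand b. */
float maxss_model(float a, float b) {
    return (a > b) ? a : b;
}

/* Hypothetical model of the max / cmpunord / blendv sequence. */
float maxnum_sketch(float v1, float v2) {
    float maxnum = maxss_model(v2, v1);  /* v1 if either input is NaN */
    int isnan1 = (v1 != v1);             /* cmpunordss v1, v1 */
    return isnan1 ? v2 : maxnum;         /* blendvps: pick v2 when v1 is NaN */
}

/* Small self-check covering the ordered and NaN input cases. */
void check_maxnum_sketch(void) {
    float r;

    assert(maxnum_sketch(1.0f, 2.0f) == 2.0f);
    assert(maxnum_sketch(2.0f, 1.0f) == 2.0f);

    /* One NaN input: the other (non-NaN) input is returned. */
    assert(maxnum_sketch(NAN, 3.0f) == 3.0f);
    assert(maxnum_sketch(3.0f, NAN) == 3.0f);

    /* Both inputs NaN: the result is NaN. */
    r = maxnum_sketch(NAN, NAN);
    assert(r != r);
}
```

Calling check_maxnum_sketch() from a test main exercises all four NaN combinations. The model deliberately says nothing about signed zeros or the upper vector elements.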

I'll translate that to LLVM and update the patch. Thanks!

http://reviews.llvm.org/D15294