[PATCH] D15294: [x86] inline calls to fmaxf / llvm.maxnum.f32 using maxss (PR24475)
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 14 15:44:44 PST 2015
spatel added a comment.
In http://reviews.llvm.org/D15294#310167, @zansari wrote:
> Quick high-level question: wouldn't it be better to pull the intermediate value out of the fmax to reduce the dependence chain?
Yes, that would be better.
Because I'm SSE dyslexic, I altered the test program in https://llvm.org/bugs/show_bug.cgi?id=24475 to check, and this is what I came up with:
  __m128 maxnum = _mm_max_ss(v2, v1);
  __m128 isnan1 = _mm_cmpunord_ss(v1, v1);
  maxnum = _mm_blendv_ps(maxnum, v2, isnan1);
Which compiles to (AT&T syntax - should invert the dyslexia, but I still can't get it right):
  vmaxss %xmm0, %xmm1, %xmm2            <--- if either input is NaN, xmm0 (v1) is returned
  vcmpunordss %xmm0, %xmm0, %xmm0
  vblendvps %xmm0, %xmm1, %xmm2, %xmm0  <--- if xmm0 (v1) is NaN, output xmm1 (v2); if not, output max or v1
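For anyone else who finds the operand order hard to follow, here is a rough scalar model of that three-instruction sequence in plain C. It is only a sketch: the helper names maxss_model and maxnum_model are made up for illustration, and it assumes the compiler keeps strict IEEE NaN semantics (no -ffast-math). The ternary mimics the maxss rule that the second source operand is returned whenever the comparison fails, which includes the case where either input is NaN.

```c
#include <math.h>

/* Hypothetical model of x86 maxss: yields the second source operand (b)
   unless a > b, so a NaN in either input produces b. */
static float maxss_model(float a, float b) {
    return a > b ? a : b;
}

/* Hypothetical model of the max / cmpunord / blendv sequence. */
static float maxnum_model(float v1, float v2) {
    float m = maxss_model(v2, v1); /* v1 comes back if either input is NaN */
    int v1_is_nan = isnan(v1);     /* stands in for cmpunordss v1, v1 */
    return v1_is_nan ? v2 : m;     /* stands in for blendvps on that mask */
}
```

With this model, maxnum_model(NAN, 3.0f) and maxnum_model(3.0f, NAN) both give 3.0f, and the result is NaN only when both inputs are NaN, which matches what fmaxf is expected to do.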
I'll translate that to LLVM and update the patch. Thanks!
http://reviews.llvm.org/D15294