[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing

Wed Sep 30 14:37:00 PDT 2015

ab added a subscriber: ab.
ab requested changes to this revision.
ab added a reviewer: ab.
ab added a comment.
This revision now requires changes to proceed.

> Another way is to separate FMA opcodes generated for FP operations
> and FMA opcodes generated for FMA intrinsics as it is done now for ADD operations,
> e.g. ADDSSrr vs ADDSSrr_Int. *_Int opcodes are handled more conservatively.
> Being more conservative in commuting 1st and 2nd operands of scalar FMAs
> right now seems better choice as stability/correctness has higher priority.

You're right that _Int would work (it is intended for exactly this situation), but I disagree that we can avoid fixing that here.  I'm probably the one who hates _Int the most, but the FMA scalar intrinsic patterns currently seem just plain wrong, and working around that in this patch isn't proper, IMHO.  You should add the _Int instructions before landing this patch.
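
For context, here is a minimal sketch of why commuting operands 1 and 2 matters for the intrinsic forms but not for plain scalar FP. It assumes the documented semantics of the scalar FMA intrinsics (e.g. _mm_fmadd_ss), where lane 0 holds a*b+c and the upper lanes are passed through from the first operand. Vec4, scalarFmaModel, commuteIsSafe and demo are hypothetical names used only for illustration; none of this is LLVM code.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4 x float vector standing in for an XMM register.
using Vec4 = std::array<float, 4>;

// Assumed model of a scalar FMA intrinsic such as _mm_fmadd_ss(a, b, c):
// lane 0 is a*b+c, lanes 1..3 are copied from the FIRST operand.
constexpr Vec4 scalarFmaModel(Vec4 a, Vec4 b, Vec4 c) {
  return {a[0] * b[0] + c[0], a[1], a[2], a[3]};
}

// True only if swapping operands 1 and 2 leaves every lane unchanged.
constexpr bool commuteIsSafe(Vec4 a, Vec4 b, Vec4 c) {
  const Vec4 x = scalarFmaModel(a, b, c);
  const Vec4 y = scalarFmaModel(b, a, c);
  return x[0] == y[0] && x[1] == y[1] && x[2] == y[2] && x[3] == y[3];
}

// Usage sketch: lane 0 agrees after the swap, the upper lanes do not.
inline void demo() {
  const Vec4 a{2.0f, 10.0f, 20.0f, 30.0f};
  const Vec4 b{3.0f, 40.0f, 50.0f, 60.0f};
  const Vec4 c{1.0f, 0.0f, 0.0f, 0.0f};
  assert(scalarFmaModel(a, b, c)[0] == scalarFmaModel(b, a, c)[0]);
  assert(!commuteIsSafe(a, b, c));
}
```

A plain scalar FP operation only observes lane 0, so the swap is harmless there. An intrinsic user can observe the upper lanes, which is why the two cases would need distinct opcodes (or a conservative commute rule).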

As for getting rid of _Int in the long term, we have https://llvm.org/bugs/show_bug.cgi?id=23449 !

http://reviews.llvm.org/D13269