[PATCH] [X86] transform insertps to blendps when possible for better performance
spatel at rotateright.com
Wed Mar 4 07:55:27 PST 2015
> I think we just need to change the SSE intrinsics to use generic shuffle IR rather than intrinsics. We shouldn't be worrying about
> re-combining the LLVM instruction intrinsics in the backend to speed things up. We should insist that code use generic IR as input if
> they want this kind of combining.
That's the 2nd suggestion I've gotten to rework the intrinsics in a week, so I guess I can't ignore that angle any longer. :)
So yes, let's put this patch on hold and see what happens when we start sending more generic shuffles down the line.