[PATCH] D26660: [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Zvi Rackover via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 15 00:13:14 PST 2016
zvi added inline comments.
================
Comment at: test/CodeGen/X86/vec_ss_load_fold.ll:41
+; X32_AVX1-NEXT: vmulss LCPI0_1, %xmm0, %xmm0
+; X32_AVX1-NEXT: vblendps {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3]
+; X32_AVX1-NEXT: vminss LCPI0_2, %xmm0, %xmm0
----------------
craig.topper wrote:
> zvi wrote:
> > This redundant blend should be documented in Bugzilla. It would be best to fix this before committing this patch.
> That blend exists because there is a vzmovl, created from the inserts of 0s, that got pushed up to here and was then blocked by the min/max nodes. I can't pattern match it out.
>
> We need some sort of demanded elements filtering that figures out vcvttss2si doesn't want the upper bits and that the min/max pass the bits straight through and thus don't want the bits either. And push that all the way back to remove the original insert elements. Or something like that.
>
> I'll file a bug, but I don't think it should block a patch that was just trying to remove an intrinsic that clang doesn't use. I could write this same test case in clang without this intrinsic and see the same extra blend.
Thanks
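For reference, the demanded-elements idea described above can be sketched as a toy backward pass over a linear chain of nodes. This is only an illustration under stated assumptions, not LLVM code: the Kind enum, demandedOfInput, and findRedundant are hypothetical names, and the three node kinds are simplified stand-ins for the scalar ops (mulss/minss), the zeroing blend (vzmovl), and the lane-0 consumer (vcvttss2si).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: hypothetical node kinds, not LLVM SelectionDAG opcodes.
enum class Kind {
  ScalarOp,   // like mulss/minss: computes lane 0, passes lanes 1-3 through
  ZeroUpper,  // like the vzmovl blend: keeps lane 0, overwrites lanes 1-3
  UseLane0    // like vcvttss2si: reads lane 0 only
};

using LaneMask = std::uint8_t;  // bit i set => lane i is demanded

// Which lanes of its input a node needs, given the lanes demanded of its output.
LaneMask demandedOfInput(Kind k, LaneMask demandedOut) {
  switch (k) {
  case Kind::UseLane0:
    return 0x1;                // only lane 0 is ever read
  case Kind::ScalarOp:
    return demandedOut;        // each output lane depends on the same input lane
  case Kind::ZeroUpper:
    return demandedOut & 0x1;  // upper lanes are overwritten, never read
  }
  return 0xF;                  // conservative fallback: all lanes
}

// Walk the chain from the last node back to the first and flag every
// ZeroUpper node whose upper lanes are not demanded by anything after it.
std::vector<bool> findRedundant(const std::vector<Kind> &chain) {
  std::vector<bool> redundant(chain.size(), false);
  LaneMask demanded = 0xF;     // assume the final result is fully demanded
  for (std::size_t i = chain.size(); i-- > 0;) {
    if (chain[i] == Kind::ZeroUpper && (demanded & 0xE) == 0)
      redundant[i] = true;
    demanded = demandedOfInput(chain[i], demanded);
  }
  return redundant;
}
```

With a chain shaped like the test case (scalar op, blend, scalar op, lane-0 consumer), the pass flags only the blend. Without a lane-0-only consumer at the end, the upper lanes remain demanded and the blend is kept.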
https://reviews.llvm.org/D26660