[PATCH] D26790: [X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times

Thu Nov 17 12:39:53 PST 2016

spatel added a comment.

In https://reviews.llvm.org/D26790#598894, @craig.topper wrote:

> Another possible fix is to lower the intrinsics to a scalar max SDNode with inserts and extracts around it, like this:
>
>   (insert_vector_elt src1 (X86max (extract_vector_elt src1, 0), (extract_vector_elt src2, 0)), 0)
>
> Then pattern match it back to the min/max intrinsic instructions. This would be equivalent to how clang emits the FADD/FSUB/FMUL/FDIV intrinsics. We would need to do this for every pattern that currently uses sse_load_f32/f64. This would probably also fix PR31032, so maybe it's worth doing?

That's what I was imagining - just so we try to standardize on a path for handling various opcodes and scalar vs. vector. But it's probably not possible for all opcodes/intrinsics.
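
To make the shape of that lowering concrete, here is a small standalone C++ model of the quoted DAG pattern. It is only a sketch of the semantics, not LLVM code: the names `V4F32`, `x86MaxScalar`, `extractElt`, `insertElt`, and `maxSSViaScalarNode` are made up for illustration, and the scalar max is assumed to follow the MAXSS rule of returning the second operand unless the first is strictly greater.

```cpp
#include <array>

// Hypothetical 4 x float vector type standing in for v4f32.
using V4F32 = std::array<float, 4>;

// Assumed scalar model of the X86max node: yields the first operand only
// when it is strictly greater, otherwise the second (MAXSS-style).
inline float x86MaxScalar(float a, float b) { return a > b ? a : b; }

// Stand-ins for extract_vector_elt and insert_vector_elt.
inline float extractElt(const V4F32 &v, unsigned idx) { return v[idx]; }
inline V4F32 insertElt(V4F32 v, float x, unsigned idx) {
  v[idx] = x;
  return v;
}

// (insert_vector_elt src1,
//    (X86max (extract_vector_elt src1, 0), (extract_vector_elt src2, 0)), 0)
inline V4F32 maxSSViaScalarNode(const V4F32 &src1, const V4F32 &src2) {
  float lo = x86MaxScalar(extractElt(src1, 0), extractElt(src2, 0));
  return insertElt(src1, lo, 0);
}
```

Only element 0 is computed; elements 1 through 3 come straight from `src1`, which is the property the pattern match back to the intrinsic instruction would rely on.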

Given Andrea's comment that we can undo this to avoid spilling, I have no objections to the patch.

But should this be gated when optimizing for size, or is it already?

https://reviews.llvm.org/D26790