[PATCH] D26790: [X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 17 12:39:53 PST 2016
spatel added a comment.
In https://reviews.llvm.org/D26790#598894, @craig.topper wrote:
> Another possible fix is to lower the intrinsics to a scalar max SDNode with inserts and extracts around it, like this: (insert_vector_elt src1 (X86max (extract_vector_elt src1, 0), (extract_vector_elt src2, 0)), 0). Then pattern match it back to the min/max intrinsic instructions. This would be equivalent to how clang emits the FADD/FSUB/FMUL/FDIV intrinsics. We would need to do this for every pattern that currently uses sse_load_f32/f64. This would probably also fix PR31032, so maybe it's worth doing?
That's what I was imagining - just so we try to standardize on a path for handling various opcodes and scalar vs. vector. But it's probably not possible for all opcodes/intrinsics.
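To make the shape of that lowering concrete, here is a minimal portable sketch that models it in plain C++ rather than in SelectionDAG. Everything named here (Vec4, x86max, extractElt, insertElt, maxSSLowered) is hypothetical and for illustration only; none of it is LLVM API. It assumes the MAXSS rule that the second operand is returned unless the first compares greater.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4 x float vector register.
using Vec4 = std::array<float, 4>;

// Assumed scalar MAXSS rule: return a only if a > b, otherwise return b
// (so a NaN in either operand, or two zeros, yields b).
inline float x86max(float a, float b) { return a > b ? a : b; }

// Models extract_vector_elt: read one lane.
inline float extractElt(const Vec4 &v, std::size_t idx) { return v[idx]; }

// Models insert_vector_elt: copy the vector and overwrite one lane.
inline Vec4 insertElt(Vec4 v, float x, std::size_t idx) {
  v[idx] = x;
  return v;
}

// (insert_vector_elt src1 (X86max (extract_vector_elt src1, 0),
//                                 (extract_vector_elt src2, 0)), 0)
inline Vec4 maxSSLowered(const Vec4 &src1, const Vec4 &src2) {
  return insertElt(src1, x86max(extractElt(src1, 0), extractElt(src2, 0)), 0);
}
```

Only lane 0 is computed; lanes 1 through 3 pass through from src1, which is why the scalar node can be matched back to the intrinsic instruction form.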
Given Andrea's comment that we can undo this to avoid spilling, I have no objections to the patch.
But should this be gated when optimizing for size, or is it already?
https://reviews.llvm.org/D26790