[llvm] [X86] LowerSelect - use BLENDV for scalar selection if not all operands are multi use (PR #125853)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 6 05:56:17 PST 2025
RKSimon wrote:
> > Changes to sse-minmax.ll - I'll push the diff (tmp commit for review - I'll remove it again later)
>
> We assume moves have negligible cost in the uarch and the total instruction count is not increased. Why is it not preferred?
Not all uarchs from the SSE4 era had move elimination, and the BLENDV instructions were often 2 uops or more - so the total uop count could increase if the 3 x 1uop logic ops (+ maybe a 1uop move for the ANDNP mask) were replaced with 3 x 1uop moves + 1 x 2uop BLENDV - that's the worst case scenario. But we already always take that chance with BLENDV for vector select, it's just the scalar selects where for some reason we were more cautious. I was trying to find a compromise, but I'm not against dropping the multiuse limit for SSE4 entirely.
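
For readers less familiar with the two forms being compared, here is a rough source-level sketch using SSE intrinsics. It is only an illustration of the trade-off, not the DAG lowering code from the PR: the function names `select_logic` and `select_blendv` and the `SSE41` macro are hypothetical, and the compiler is free to emit different instructions than the comments suggest.

```cpp
#include <immintrin.h>

// Assumption: GCC/Clang-style target attribute so the block builds without -msse4.1.
#if defined(__GNUC__) || defined(__clang__)
#define SSE41 __attribute__((target("sse4.1")))
#else
#define SSE41
#endif

// Logic-op form of (x < y) ? a : b, i.e. (mask & a) | (~mask & b).
// Maps naturally to ANDPS + ANDNPS + ORPS (three single-uop ops).
static float select_logic(float x, float y, float a, float b) {
  // Lane 0 of mask is all-ones if x < y, else all-zeros.
  __m128 mask = _mm_cmplt_ss(_mm_set_ss(x), _mm_set_ss(y));
  __m128 t = _mm_and_ps(mask, _mm_set_ss(a));     // keep a where mask is set
  __m128 f = _mm_andnot_ps(mask, _mm_set_ss(b));  // keep b where mask is clear
  return _mm_cvtss_f32(_mm_or_ps(t, f));
}

// BLENDV form of the same select. The non-VEX BLENDVPS encoding takes its
// mask implicitly in XMM0, which is where the extra register moves come from.
SSE41 static float select_blendv(float x, float y, float a, float b) {
  __m128 mask = _mm_cmplt_ss(_mm_set_ss(x), _mm_set_ss(y));
  // _mm_blendv_ps(p, q, m) picks q where the sign bit of m is set, else p.
  return _mm_cvtss_f32(_mm_blendv_ps(_mm_set_ss(b), _mm_set_ss(a), mask));
}
```

Both functions compute the same result; the difference discussed above is purely in the uop count of the generated code on uarchs without move elimination.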
https://github.com/llvm/llvm-project/pull/125853