[PATCH] D15741: [X86] Avoid folding scalar loads into unary sse intrinsics
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 23 03:46:25 PST 2015
mkuper created this revision.
mkuper added reviewers: RKSimon, spatel, andreadb.
mkuper added subscribers: llvm-commits, DavidKreitzer.
Not folding these cases tends to avoid partial register updates:

    sqrtss (%eax), %xmm0

has a partial update of %xmm0, while

    movss (%eax), %xmm0
    sqrtss %xmm0, %xmm0

has a clobber of the high lanes immediately before the partial update, avoiding a potential stall.
Given this, we only want to fold when optimizing for size.
This is consistent with the patterns we already have for the fp/int converts, and with the handling in X86InstrInfo::foldMemoryOperandImpl().