[PATCH] D15741: [X86] Avoid folding scalar loads into unary sse intrinsics

Wed Dec 23 03:46:25 PST 2015

mkuper created this revision.
mkuper added reviewers: RKSimon, spatel, andreadb.
mkuper added subscribers: llvm-commits, DavidKreitzer.

Not folding these cases tends to avoid partial register updates:

```
sqrtss (%eax), %xmm0
```
Has a partial update of %xmm0, while

```
movss (%eax), %xmm0
sqrtss %xmm0, %xmm0
```
Has a clobber of the high lanes immediately before the partial update (movss from memory zeroes them, which breaks the dependency on the old value of %xmm0), avoiding a potential stall.

Given this, we only want to fold when optimizing for size.
This is consistent with the patterns we already have for the fp/int converts, and with the logic in X86InstrInfo::foldMemoryOperandImpl().
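
For illustration only, here is a minimal C sketch of source that gives rise to this pattern: a scalar load feeding a unary SSE intrinsic. The function name sqrt_ss_from_memory is hypothetical and not part of the patch. Whether a given compiler emits the folded form (sqrtss from memory) or the movss + sqrtss pair depends on its version and on whether it is optimizing for size, so the instruction comments describe the unfolded form only.

```c
#include <xmmintrin.h>

/* Hypothetical helper, not from the patch: scalar load followed by a
 * unary SSE intrinsic that only computes the low lane. */
static __m128 sqrt_ss_from_memory(const float *p) {
    /* Scalar load: low lane = *p, upper three lanes zeroed
     * (a full-register write, like movss from memory). */
    __m128 v = _mm_load_ss(p);
    /* Unary scalar op: low lane = sqrt(low lane of v),
     * upper three lanes copied from v (like sqrtss reg, reg). */
    return _mm_sqrt_ss(v);
}
```

Built with and without size optimization, the generated assembly for a function like this is where the two sequences quoted above would be expected to show up.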

http://reviews.llvm.org/D15741

Files:
  lib/Target/X86/X86InstrSSE.td
  test/CodeGen/X86/fold-load-unops.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D15741.43519.patch
Type: text/x-patch
Size: 8483 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151223/049379d6/attachment.bin>