[PATCH] combine consecutive subvector 16-byte loads into one 32-byte load (PR21709)

Sanjay Patel spatel at rotateright.com
Tue Dec 2 18:03:20 PST 2014


Hi craig.topper, delena, andreadb,

This is a partial fix for PR21709 ( http://llvm.org/bugs/show_bug.cgi?id=21709 ). When two consecutive 16-byte loads are merged into one 32-byte vector, we can use a single 32-byte load instead. We don't do this for SandyBridge / IvyBridge, though, because they have slower 32-byte memops.
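
To illustrate the equivalence the transform relies on, here is a minimal sketch in portable C. It is not taken from the patch: the vec256 type and the function names are hypothetical, and memcpy stands in for the actual vector load instructions. It only models the data movement, not instruction selection or the per-CPU cost decision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-byte value, standing in for a 256-bit vector register. */
typedef struct { uint8_t bytes[32]; } vec256;

/* Before: two 16-byte loads from consecutive addresses,
   merged into one 32-byte value. */
static vec256 load_as_two_halves(const uint8_t *p) {
    assert(p != NULL);
    vec256 v;
    memcpy(v.bytes, p, 16);            /* low half:  bytes 0..15  */
    memcpy(v.bytes + 16, p + 16, 16);  /* high half: bytes 16..31 */
    return v;
}

/* After: a single 32-byte load from the same base address. */
static vec256 load_as_one(const uint8_t *p) {
    assert(p != NULL);
    vec256 v;
    memcpy(v.bytes, p, 32);
    return v;
}

/* Returns 1 when both strategies produce the same 32 bytes. */
static int loads_agree(const uint8_t *p) {
    vec256 a = load_as_two_halves(p);
    vec256 b = load_as_one(p);
    return memcmp(a.bytes, b.bytes, sizeof a.bytes) == 0;
}
```

Because the two halves come from adjacent addresses, both functions read the same 32 bytes, so the choice between them is purely a performance question for the target.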

I'm not too confident in my tablegen skills yet, so I've left the full list of pattern possibilities out of this patch until I get this one right. There's a 'TODO' comment where I think we'll want to add more patterns.

http://reviews.llvm.org/D6492

Files:
  lib/Target/X86/X86InstrInfo.td
  lib/Target/X86/X86InstrSSE.td
  test/CodeGen/X86/unaligned-32-byte-memops.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6492.16841.patch
Type: text/x-patch
Size: 2694 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141203/8a013814/attachment.bin>