[PATCH] combine consecutive subvector 16-byte loads into one 32-byte load (PR21709)

Sanjay Patel spatel at rotateright.com
Wed Dec 3 09:01:06 PST 2014


Hi Elena - thanks for your feedback! Updated patch attached.

I've added test cases that use shufflevector rather than the vinsertf128 intrinsic. These are derived from Andrea's alternate code example in the bug report. Does this cover the insertelement scenario you had in mind? Unless I'm mistaken, insertelement only works with scalars, so shufflevector would be the expected pattern from IR that had insert/extract?

I've also added test cases that swap the operand order for both the intrinsic and the shuffles. Does this address your concern about changing the order of the pattern parameters?
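
For illustration only, here is a minimal sketch of the kind of shufflevector IR being discussed: two adjacent unaligned 16-byte loads concatenated into one 32-byte vector, plus a variant with the shuffle operands swapped. It is not copied from the attached patch. The function names are hypothetical, and the syntax is that of current LLVM (opaque `ptr` type); LLVM as of 2014 used typed pointers, so the loads and GEPs would be spelled differently there.

```llvm
; Hypothetical example: low half at %p, high half at %p + 16 bytes.
define <8 x float> @concat_lo_hi(ptr %p) {
  %hi.ptr = getelementptr inbounds float, ptr %p, i64 4
  %lo = load <4 x float>, ptr %p, align 1
  %hi = load <4 x float>, ptr %hi.ptr, align 1
  ; Mask 0..7 selects all of %lo followed by all of %hi.
  %v = shufflevector <4 x float> %lo, <4 x float> %hi,
         <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ret <8 x float> %v
}

; Same result with the shuffle operands swapped: indices 4..7 now
; refer to %lo and indices 0..3 to %hi, so the output is still lo:hi.
define <8 x float> @concat_lo_hi_swapped(ptr %p) {
  %hi.ptr = getelementptr inbounds float, ptr %p, i64 4
  %lo = load <4 x float>, ptr %p, align 1
  %hi = load <4 x float>, ptr %hi.ptr, align 1
  %v = shufflevector <4 x float> %hi, <4 x float> %lo,
         <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3>
  ret <8 x float> %v
}
```

Both functions read the same 32 contiguous bytes, so either form is a candidate for a single unaligned 32-byte load.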

http://reviews.llvm.org/D6492

Files:
  lib/Target/X86/X86InstrInfo.td
  lib/Target/X86/X86InstrSSE.td
  test/CodeGen/X86/unaligned-32-byte-memops.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6492.16867.patch
Type: text/x-patch
Size: 6191 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141203/28b3ded0/attachment.bin>

