[PATCH] merge consecutive 16-byte loads into one 32-byte load (PR22329)

Michael Kuperstein michael.m.kuperstein at intel.com
Sun Feb 1 06:37:53 PST 2015

Comment at: lib/Target/X86/X86ISelLowering.cpp:6099
@@ -6096,3 +6098,3 @@
     if (isAfterLegalize &&
         !DAG.getTargetLoweringInfo().isOperationLegal(ISD::LOAD, VT))
If I'm reading this correctly, before this change, whenever we got here the size of VT always matched the size of the consecutive load we found (VT.getSizeInBits() == EltVt.getSizeInBits() * NumElems).
With this change, I think that no longer holds. The size of the consecutive load we find is LdVT.getSizeInBits() * Elts.size(), but there's no guarantee that this is actually the size of VT. The responsibility for ensuring this condition holds has moved to the caller.

I think we now need an additional check that the sizes indeed match.
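
To make the condition concrete, here is a minimal standalone sketch of the size check I have in mind. It deliberately does not use the LLVM API: the struct ConsecutiveLoadInfo and the function loadSizeMatchesVT are hypothetical names for illustration only, and the three fields stand in for VT.getSizeInBits(), LdVT.getSizeInBits() and Elts.size(). Where exactly the real check belongs in the patch is a separate question.

```cpp
#include <cstdint>

// Hypothetical stand-in for the quantities discussed above; not LLVM API.
struct ConsecutiveLoadInfo {
  std::uint64_t VTBits;   // models VT.getSizeInBits()
  std::uint64_t LdVTBits; // models LdVT.getSizeInBits()
  std::uint64_t NumElts;  // models Elts.size()
};

// Returns true only when the run of consecutive loads covers exactly
// the number of bits in the result type VT.
bool loadSizeMatchesVT(const ConsecutiveLoadInfo &Info) {
  return Info.VTBits == Info.LdVTBits * Info.NumElts;
}
```

For example, two 128-bit loads feeding a 256-bit VT pass this check, while a single 128-bit load feeding a 256-bit VT does not, which is the case I think can now slip through.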

Comment at: lib/Target/X86/X86ISelLowering.cpp:13216
@@ +13215,3 @@
+  // --> load32 addr
+  if (Vec.getOpcode() == ISD::INSERT_SUBVECTOR &&
+      OpVT.is256BitVector() &&
You probably also want to check that Idx is what you expect it to be.


