[PATCH] D12635: merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711)

Fri Sep 11 14:30:17 PDT 2015

spatel added inline comments.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:804
@@ +803,3 @@
+    // On Cyclone, unaligned 128-bit stores are slow.
+    *Fast = !Subtarget->isCyclone() || VT.getSizeInBits() != 128 ||
+            // See comments in performSTORECombine() for more details about
----------------
arsenm wrote:
> Pedantry: using VT.getStoreSize() would be preferable 
Ok. 

Note: the 'store' in the name misled me at first, since this hook handles both loads and stores, but I see now that getStoreSize() is not really about store instructions specifically.
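For reference, here is a minimal standalone sketch of what the suggested change amounts to. It does not use LLVM headers: `ToyVT` and `isFastUnaligned` are hypothetical stand-ins, and the remaining terms of the real condition (cut off in the quoted diff) are not modeled. The only assumption carried over is that the store size is the bit width rounded up to whole bytes, so a 128-bit type has a store size of 16.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a value type; this is NOT llvm::EVT.
// Only the total bit width is modeled.
struct ToyVT {
  std::uint64_t SizeInBits;

  constexpr std::uint64_t getSizeInBits() const { return SizeInBits; }

  // Bytes touched by a memory access of this type:
  // the bit width rounded up to a whole number of bytes.
  constexpr std::uint64_t getStoreSize() const { return (SizeInBits + 7) / 8; }
};

// Hypothetical helper with the same shape as the first two terms of the
// check in the patch: an unaligned access is treated as slow only when it
// is a 16-byte access on Cyclone. It applies to loads and stores alike.
constexpr bool isFastUnaligned(bool IsCyclone, ToyVT VT) {
  return !IsCyclone || VT.getStoreSize() != 16;
}

// For a 128-bit type the byte-based and bit-based checks agree.
static_assert(ToyVT{128}.getStoreSize() == 16, "128 bits is 16 bytes");
static_assert(ToyVT{128}.getSizeInBits() == 128, "bit width is unchanged");
```

In this sketch, `isFastUnaligned(true, ToyVT{128})` is false and every other combination of subtarget and size is true, which matches the intent of the comment in the diff.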

================
Comment at: test/CodeGen/AArch64/merge-store.ll:30
@@ +29,3 @@
+
+define void @merge_vec_extract_stores(<4 x float> %v1, <2 x float>* %ptr) {
+  %idx0 = getelementptr inbounds <2 x float>, <2 x float>* %ptr, i64 3
----------------
arsenm wrote:
> Can you add some more test cases with different combination sized vectors? I would like to see a testcase that might try to combine multiple 3x vectors
I've been trying to find a way to do this, but every attempt so far has failed: types like v3f32 are not simple types, so MergeConsecutiveStores doesn't get very far with them. Do you have a specific pattern in mind? Keep in mind that this patch limits vector merging to stores of elements extracted from a vector, so we're not even handling loads yet.
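To illustrate the obstacle, here is a toy model in standalone C++. It is not the MergeConsecutiveStores code: the table of widths and both helper names are hypothetical, and the table is only an illustrative subset of f32 vector widths assumed to have simple types.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Illustrative assumption: f32 vector element counts that map to a simple
// type in this toy model. Element counts such as 3 or 6 are absent.
static const std::array<unsigned, 4> SimpleF32Widths = {{2, 4, 8, 16}};

// Hypothetical helper: does a vector of NumElts floats have a simple type?
bool isSimpleF32Vector(unsigned NumElts) {
  return std::find(SimpleF32Widths.begin(), SimpleF32Widths.end(), NumElts) !=
         SimpleF32Widths.end();
}

// Hypothetical helper: merging Count consecutive stores of NumElts floats
// each is only attempted when both the source type and the merged type
// are simple in this model.
bool canMergeF32VectorStores(unsigned NumElts, unsigned Count) {
  return isSimpleF32Vector(NumElts) && isSimpleF32Vector(NumElts * Count);
}
```

Under these assumptions, two v2f32 stores can merge into a v4f32 store, while two v3f32 stores are rejected before any merged type is considered.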

http://reviews.llvm.org/D12635