[PATCH] D22071: Correct ordering of loads/stores.

Thu Jul 7 11:18:58 PDT 2016

asbirlea added a comment.

I think most comments are addressed now. Let me know if I missed anything.


================
Comment at: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:97
@@ -95,3 +96,3 @@
 
   /// Returns the first and the last instructions in Chain.
   std::pair<BasicBlock::iterator, BasicBlock::iterator>
----------------
Formatted.

================
Comment at: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:343
@@ +342,3 @@
+  Worklist.insert(Worklist.end(), I);
+  while (Worklist.size()) {
+    auto LastElement = Worklist.end();
----------------
Used a work-list for the other method.
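
For readers unfamiliar with the pattern, here is a minimal standalone sketch of a work-list traversal over operand dependencies. It is not the code from this patch: `OperandTable` and `collectTransitiveOperands` are hypothetical names, and plain indices stand in for LLVM instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for instructions: Operands[i] lists the indices of
// the instructions that instruction i uses. This is not LLVM's data model.
// Assumption: every operand index is a valid index into the table.
using OperandTable = std::vector<std::vector<std::size_t>>;

// Collects Root and everything it transitively depends on, using an explicit
// work-list instead of recursion.
std::set<std::size_t> collectTransitiveOperands(const OperandTable &Operands,
                                                std::size_t Root) {
  assert(Root < Operands.size() && "Root must index into the table");
  std::set<std::size_t> Visited;
  std::vector<std::size_t> Worklist;
  Worklist.push_back(Root);
  while (!Worklist.empty()) {
    std::size_t Current = Worklist.back();
    Worklist.pop_back();
    // insert() reports whether Current was new; skip repeats so that shared
    // operands are processed only once.
    if (!Visited.insert(Current).second)
      continue;
    for (std::size_t Op : Operands[Current])
      Worklist.push_back(Op);
  }
  return Visited;
}
```

The explicit work-list keeps the traversal iterative, so a long dependency chain does not grow the call stack.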

================
Comment at: test/Transforms/LoadStoreVectorizer/AMDGPU/insertion-point.ll:13
@@ -12,2 +12,3 @@
+; CHECK: %w = add i32 %y, 9
 ; CHECK: %foo = add i32 %z, %w
 define void @insert_load_point(float addrspace(1)* nocapture %a, float addrspace(1)* nocapture %b, float addrspace(1)* nocapture readonly %c, i64 %idx, i32 %x, i32 %y) #0 {
----------------
For now, I resolved this by adding the comment you suggested to this test and to the other two tests that check that the order is preserved.

================
Comment at: test/Transforms/LoadStoreVectorizer/X86/correct-order.ll:17
@@ +16,3 @@
+
+  %l1 = load i32, i32* %next.gep1, align 4
+  %l2 = load i32, i32* %next.gep, align 4
----------------
Removed the loops from all tests.

================
Comment at: test/Transforms/LoadStoreVectorizer/X86/correct-order.ll:19
@@ +18,3 @@
+  %l2 = load i32, i32* %next.gep, align 4
+  store i32 0, i32* %next.gep1, align 4
+  store i32 0, i32* %next.gep, align 4
----------------
I'm not sure how to construct one properly right now. I added a test that attempts it, but what it actually verifies is that no vectorization happens across basic blocks (and, implicitly, through a phi node).


http://reviews.llvm.org/D22071