[PATCH] D21935: Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer

Wed Jul 6 15:49:38 PDT 2016

jlebar added inline comments.

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:391
@@ -390,1 +390,3 @@
 
+  /// \brief Indicate whether target allows misaligned memory accesses
+  bool allowsMisalignedMemoryAccesses(unsigned BitWidth, unsigned AddressSpace = 0,
----------------
Maybe a better comment would be

```
Determine whether we can read BitWidth bits from memory with the given alignment (in bytes).
```

That is, this isn't strictly for determining whether *misaligned* memory accesses are allowed -- if Alignment == BitWidth, the call is still allowed (right?).

================
Comment at: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:703
@@ +702,3 @@
+  // TODO: Remove TargetBaseAlign
+  bool Fast;
+  if (allowsMisaligned(SzInBytes, AS, Alignment, &Fast) == true && Fast &&
----------------
Might as well set this to false, to guard against low-quality TTI overrides.

================
Comment at: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:704
@@ +703,3 @@
+  bool Fast;
+  if (allowsMisaligned(SzInBytes, AS, Alignment, &Fast) == true && Fast &&
+      (Alignment % SzInBytes) != 0 && (Alignment % TargetBaseAlign) != 0) {
----------------
No `== true`.

================
Comment at: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:705
@@ -700,1 +704,3 @@
+  if (allowsMisaligned(SzInBytes, AS, Alignment, &Fast) == true && Fast &&
+      (Alignment % SzInBytes) != 0 && (Alignment % TargetBaseAlign) != 0) {
     if (S0->getPointerAddressSpace() == 0) {
----------------
I'm not sure why this is the right condition.  "If the store is going to be misaligned, don't vectorize it."  This is now, essentially, "if the store is going to be misaligned and misaligned stores of this witdh and alignment are not fast, don't vectorize it," right?  So shouldn't the condition be

  `(Alignment % SzInBytes) != 0 && (!allowsMisaligned(...) || !Fast)`

?

In fact, maybe it makes sense to centralize this logic, rename `allowsMisaligned`, and just do

  `memoryAccessIsAllowedAndFast(SzInBytes, AS, Alignment)`

================
Comment at: test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll:505
@@ -504,4 +504,3 @@
 ; CHECK-LABEL: @merge_local_store_2_constants_i32_align_2
-; CHECK: store i32
-; CHECK: store i32
+; CHECK: store <2 x i32> <i32 456, i32 123>, <2 x i32> addrspace(3)* %1, align 2
 define void @merge_local_store_2_constants_i32_align_2(i32 addrspace(3)* %out) #0 {
----------------
I am surprised we can vectorize this...are you sure it's not a bug?


http://reviews.llvm.org/D21935