[all-commits] [llvm/llvm-project] f9c2a3: [SROA] Create additional vector type candidates ba...

Han Zhu via All-commits all-commits at lists.llvm.org
Wed Mar 8 12:01:47 PST 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f9c2a341b94ca71508dcefa109ece843459f7f13
      https://github.com/llvm/llvm-project/commit/f9c2a341b94ca71508dcefa109ece843459f7f13
  Author: Han Zhu <zhuhan7737 at gmail.com>
  Date:   2023-03-08 (Wed, 08 Mar 2023)

  Changed paths:
    M llvm/lib/Transforms/Scalar/SROA.cpp
    A llvm/test/Transforms/SROA/pr57796.ll
    M llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
    M llvm/test/Transforms/SROA/vector-promotion.ll

  Log Message:
  -----------
  [SROA] Create additional vector type candidates based on store and load slices

Second try at A-Wadhwani's https://reviews.llvm.org/D132096, which was reverted.
The original patch had three issues:
* https://reviews.llvm.org/D134032, which bjope kindly fixed. That patch is merged into this one.
* [GHI #57796](https://github.com/llvm/llvm-project/issues/57796). Fixed and added a test.
* [GHI #57821](https://github.com/llvm/llvm-project/issues/57821). I believe this is undefined behavior and not the fault of the original patch. Please see the issue for more details.

Original diff summary:

This patch adds additional vector types to be considered when doing promotion in
SROA, based on the types of the store and load slices. This provides more
promotion opportunities, by potentially using an optimal "intermediate" vector
type.

For example, the following code is currently not promoted to a vector,
because `__m128i` is a `<2 x i64>` vector, whose element boundaries do not line up with the four `i32` stores.
```
__m128i packfoo0(int a, int b, int c, int d) {
  int r[4] = {a, b, c, d};
  __m128i rm;
  std::memcpy(&rm, r, sizeof(rm));
  return rm;
}
```
```
packfoo0(int, int, int, int):
  mov     dword ptr [rsp - 24], edi
  mov     dword ptr [rsp - 20], esi
  mov     dword ptr [rsp - 16], edx
  mov     dword ptr [rsp - 12], ecx
  movaps  xmm0, xmmword ptr [rsp - 24]
  ret
```
By also considering the element types of the load and store slices, SROA can find that `<4 x i32>` is valid for promotion, which removes the memory accesses from this function. In other words, for the load and store instructions in the Slices we can try new vector types that have the same size but a different element type, which gives us more promotion opportunities.
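
The idea can be sketched outside of LLVM as follows. This is a simplified model and not the code in `SROA.cpp`: `Candidate`, `extraCandidates`, and `toString` are hypothetical names, types are reduced to integer bit widths, and the real pass still has to check that every slice is compatible with a candidate before promoting.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a fixed-width vector type such as <4 x i32>.
struct Candidate {
  unsigned NumElts; // number of elements
  unsigned EltBits; // bit width of each integer element
};

// Given the total width of an existing candidate (128 for <2 x i64>) and the
// bit widths of the scalar types used by load/store slices, propose vector
// types of the same total size whose element type matches a slice type.
std::vector<Candidate> extraCandidates(unsigned VecBits,
                                       const std::vector<unsigned> &SliceEltBits) {
  assert(VecBits > 0 && "vector width must be non-zero");
  std::set<unsigned> Seen; // element widths already proposed
  std::vector<Candidate> Out;
  for (unsigned Bits : SliceEltBits) {
    // Skip widths that cannot evenly tile the vector.
    if (Bits == 0 || Bits >= VecBits || VecBits % Bits != 0)
      continue;
    // Propose each element width only once.
    if (!Seen.insert(Bits).second)
      continue;
    Out.push_back({VecBits / Bits, Bits});
  }
  return Out;
}

// Render a candidate in LLVM IR-like syntax for display.
std::string toString(const Candidate &C) {
  return "<" + std::to_string(C.NumElts) + " x i" + std::to_string(C.EltBits) + ">";
}
```

For `packfoo0`, calling `extraCandidates(128, {32, 32, 32, 32})` on the four `i32` stores proposes a single extra candidate that `toString` renders as `<4 x i32>`.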

Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed.
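
As a minimal illustration of why the comparator matters (again with hypothetical names, not the actual `SROA.cpp` code): `std::unique` expects a predicate that answers "are these two adjacent elements equal?", so the sort step and the dedup step need two different predicates.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a vector type candidate.
struct VecTy {
  unsigned NumElts; // e.g. 4 for <4 x i32>
  unsigned EltBits; // e.g. 32 for <4 x i32>
};

// Ordering predicate: suitable for std::sort.
bool rankLess(const VecTy &A, const VecTy &B) { return A.NumElts < B.NumElts; }

// Equality predicate consistent with rankLess: suitable for std::unique.
bool rankEq(const VecTy &A, const VecTy &B) { return A.NumElts == B.NumElts; }

// Sort candidates by rank, then drop adjacent candidates of equal rank.
// Passing rankLess to std::unique instead would violate its requirement that
// the predicate be an equivalence relation, and distinct candidates could be
// dropped.
std::vector<VecTy> dedupCandidates(std::vector<VecTy> Tys) {
  std::sort(Tys.begin(), Tys.end(), rankLess);
  Tys.erase(std::unique(Tys.begin(), Tys.end(), rankEq), Tys.end());
  return Tys;
}
```

With the input `{{4, 32}, {2, 64}, {4, 32}, {8, 16}}`, `dedupCandidates` keeps one candidate each with 2, 4, and 8 elements.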

Differential Revision: https://reviews.llvm.org/D143225



