[PATCH] D48725: [RFC][SLP] Vectorize bit-parallel operations with GPR.

Thu Jun 28 08:12:27 PDT 2018

courbet created this revision.
Herald added a subscriber: llvm-commits.

Consider the following code:

  struct S {
    int32_t a;
    int32_t b;
    int64_t c;
    int32_t d;
  };

  S PartialCopy(const S& s) {
    S result;
    result.a = s.a;
    result.b = s.b;
    return result;
  }

The two load/store pairs are not vectorized:

  mov eax, dword ptr [rsi]
  mov dword ptr [rdi], eax
  mov eax, dword ptr [rsi + 4]
  mov dword ptr [rdi + 4], eax
  mov rax, rdi
  ret

This happens because the SLP vectorizer only considers 4xi32=i128 as a
candidate, since a vector register of that width exists. It never considers
2xi32=i64, because the only register of that width is a GPR.
However, all operations that only manipulate values as arrays of
bits (e.g. Load, Store, Bitcast, and potentially Xor/And/Or) do not
strictly require vector registers. Let's call these **bit-parallel**
operations.
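
To illustrate the distinction, here is a minimal sketch (not part of the
patch) that packs two i32 lanes into a single 64-bit integer. The helper
names Pack, Lane, XorIsLaneWise and AddIsLaneWise are hypothetical and exist
only for this example. Xor on the packed value matches the lane-wise result
for any input, whereas Add does not in general, because a carry out of the
low lane leaks into the high lane.

```cpp
#include <cassert>
#include <cstdint>

// Packs two 32-bit lanes into one 64-bit value (lane 0 in the low half).
inline uint64_t Pack(uint32_t lane0, uint32_t lane1) {
  return (static_cast<uint64_t>(lane1) << 32) | lane0;
}

// Extracts lane `i` (0 or 1) from a packed 64-bit value.
inline uint32_t Lane(uint64_t packed, int i) {
  return static_cast<uint32_t>(packed >> (32 * i));
}

// Xor only combines bits at the same position, so a single 64-bit xor
// yields the same result as two independent 32-bit xors.
inline bool XorIsLaneWise(uint32_t a0, uint32_t a1, uint32_t b0, uint32_t b1) {
  const uint64_t r = Pack(a0, a1) ^ Pack(b0, b1);
  return Lane(r, 0) == (a0 ^ b0) && Lane(r, 1) == (a1 ^ b1);
}

// Add propagates carries across bit positions, so a single 64-bit add only
// matches two 32-bit adds when lane 0 does not overflow.
inline bool AddIsLaneWise(uint32_t a0, uint32_t a1, uint32_t b0, uint32_t b1) {
  const uint64_t r = Pack(a0, a1) + Pack(b0, b1);
  return Lane(r, 0) == static_cast<uint32_t>(a0 + b0) &&
         Lane(r, 1) == static_cast<uint32_t>(a1 + b1);
}

// Small self-check: lane 0 overflows, so only the xor stays lane-wise.
inline void CheckBitParallel() {
  assert(XorIsLaneWise(0xFFFFFFFFu, 1u, 1u, 2u));
  assert(!AddIsLaneWise(0xFFFFFFFFu, 1u, 1u, 2u));
}
```

Load, Store and Bitcast fall into the same category as Xor here, since they
move or reinterpret bits without combining neighbouring positions.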

This change lets the SLP vectorizer vectorize bit-parallel operations using
the native GPR size.

The example above will vectorize to:

  mov rax, qword ptr [rsi]
  mov qword ptr [rdi], rax
  mov rax, rdi
  ret
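
At the source level, the merged access is roughly equivalent to copying both
fields with a single 8-byte move. The sketch below illustrates that
equivalence only; it is not the IR the patch emits. PartialCopyWide is a
hypothetical name, and unlike the original PartialCopy it value-initializes
the result, purely so that the example is deterministic. It assumes that `a`
and `b` are adjacent with no padding, which the static_asserts verify.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct S {
  int32_t a;
  int32_t b;
  int64_t c;
  int32_t d;
};

// The single 64-bit copy is only valid if `a` and `b` are contiguous.
static_assert(offsetof(S, a) == 0, "a must be the first member");
static_assert(offsetof(S, b) == sizeof(int32_t), "b must directly follow a");

// Copies `a` and `b` together as one 8-byte chunk, mirroring the single
// qword load/store pair in the vectorized assembly.
inline S PartialCopyWide(const S& s) {
  S result{};  // Assumption for this sketch: zero the fields that are not copied.
  std::memcpy(&result, &s, 2 * sizeof(int32_t));
  return result;
}
```

Compilers typically lower a fixed-size 8-byte memcpy like this one to a
single 64-bit load and store on x86-64, which is the same shape as the
output shown above.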

Repository:
  rL LLVM

https://reviews.llvm.org/D48725

Files:
  include/llvm/Transforms/Vectorize/SLPVectorizer.h
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/X86/bit-parallel.ll
  test/Transforms/SLPVectorizer/X86/tiny-tree.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D48725.153325.patch
Type: text/x-patch
Size: 21896 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180628/a2d6ea91/attachment.bin>