[llvm] [SLP] no need to generate extract for in-tree uses for original scala… (PR #76077)

Thu Dec 21 18:33:08 PST 2023

================
@@ -11898,6 +11864,12 @@ Value *BoUpSLP::vectorizeTree(
     Value *Vec = E->VectorizedValue;
     assert(Vec && "Can't find vectorizable value");
 
+    // Generate extract for in-tree uses if the use is scalar operand in
+    // vectorized instruction.
+    if (auto *UserVecTE = getTreeEntry(User))
+      if (doesInTreeUserNeedToExtract(Scalar, cast<Instruction>(User), TLI))
+        User = cast<llvm::User>(UserVecTE->VectorizedValue);
+
----------------
Enna1 wrote:

> Why need this?

This PR removes the code in `BoUpSLP::vectorizeTree()` that added new vectorized load/store/call instructions using an in-tree scalar to `ExternalUses`, but we still need to generate extracts for those newly added vectorized load/store/call instructions. Because the original scalar load/store/call is added to `ExternalUses` in `buildingExtractUsers()`, I added a check here so that the extract is generated for the newly added vectorized load/store/call instead of the scalar load/store/call, which will be erased later.

Take `llvm/test/Transforms/SLPVectorizer/X86/extract_in_tree_user.ll` as an example.
```
define i32 @fn1() {
entry:
  %0 = load ptr, ptr @a, align 8
  %add.ptr = getelementptr inbounds i64, ptr %0, i64 11
  %1 = ptrtoint ptr %add.ptr to i64
  store i64 %1, ptr %add.ptr, align 8
  %add.ptr1 = getelementptr inbounds i64, ptr %0, i64 56
  %2 = ptrtoint ptr %add.ptr1 to i64
  %arrayidx2 = getelementptr inbounds i64, ptr %0, i64 12
  store i64 %2, ptr %arrayidx2, align 8
  ; store <2 x i64> %5, ptr %add.ptr, align 8
  ret i32 undef
}
```
Without this change, when generating extracts, `ExternalUses` contains both
 `{%add.ptr, store i64 %1, ptr %add.ptr, align 8}` and
  `{%add.ptr, store <2 x i64> %5, ptr %add.ptr, align 8}`.
What this patch wants to do is generate an extract only for `{%add.ptr, store <2 x i64> %5, ptr %add.ptr, align 8}`, since there is no need to generate an extract for `{%add.ptr, store i64 %1, ptr %add.ptr, align 8}`.
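
As a rough illustration of the redirect done in the hunk above, here is a toy C++ model. It uses plain strings instead of LLVM values, and `ExternalUse`, `redirectInTreeUsers` and `VectorizedUser` are hypothetical names for this sketch only, not part of the patch or of LLVM. It assumes the map holds only those in-tree scalar users that still need the scalar operand.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Toy model only: plain strings stand in for llvm::Value / llvm::User.
struct ExternalUse {
  std::string Scalar; // scalar value that may need an extractelement
  std::string User;   // instruction that uses the scalar
};

// Hypothetical helper, not an LLVM API. VectorizedUser maps an in-tree scalar
// user to the vector instruction that replaces it. Every use whose user is in
// the map is redirected to the vector instruction; other uses are unchanged.
std::vector<ExternalUse> redirectInTreeUsers(
    const std::vector<ExternalUse> &Uses,
    const std::unordered_map<std::string, std::string> &VectorizedUser) {
  std::vector<ExternalUse> Result;
  Result.reserve(Uses.size());
  for (const ExternalUse &EU : Uses) {
    auto It = VectorizedUser.find(EU.User);
    const std::string &NewUser =
        It == VectorizedUser.end() ? EU.User : It->second;
    Result.push_back(ExternalUse{EU.Scalar, NewUser});
  }
  return Result;
}
```

With the example above, the entry `{%add.ptr, store i64 %1, ptr %add.ptr, align 8}` would come out as `{%add.ptr, store <2 x i64> %5, ptr %add.ptr, align 8}`, so the extract is emitted for the vector store rather than for the scalar store that gets erased.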

---

> I meant try to fix the place where we buildingExtractUsers directly, I think it can be fixed by proper checking of the scalar element.

Sorry, I don't quite understand this. Could you please explain it a bit more?

https://github.com/llvm/llvm-project/pull/76077