[PATCH] D145163: Add support for vectorization of interleaved memory accesses for scalable VF
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 10 07:35:37 PST 2023
reames added inline comments.
================
Comment at: llvm/include/llvm/IR/IRBuilder.h:771
+ /// Create a masked interleaved load using a masked load and deinterliving
+ /// intrinsics.
----------------
This is the wrong interface. The IRBuilder interface should provide a way to create the interleave and deinterleave intrinsic calls. That interface should generate shuffles for fixed vectors. The calling logic in the vectorizer should then worry about emitting the load/store. (That is in fact the existing structure.)
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2987
Value *InnerLoopVectorizer::createBitOrPointerCast(Value *V, VectorType *DstVTy,
const DataLayout &DL) {
----------------
The changes to this function are NFC for fixed length vectors, and a generally useful scalable cleanup. Please separate and land this change without the need for further review.
This applies *only* to the changes in this function so as to shrink the diff for future review.
================
Comment at: llvm/test/Transforms/LoopVectorize/sve-interleaved-accesses.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -mtriple=aarch64-none-linux-gnu -S -passes=loop-vectorize,instcombine -force-vector-width=4 -force-vector-interleave=1 -enable-interleaved-mem-accesses=true -enable-sve-interleaved-mem-accesses=true -mattr=+sve -scalable-vectorization=on -runtime-memory-check-threshold=24 < %s | FileCheck %s
----------------
This should be in the AArch64 sub-tree, and probably precommitted. Depending on your confidence in the AArch64 code, you may want to separate that into its own review.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145163/new/
https://reviews.llvm.org/D145163