[llvm] [NVPTX] Optimize v16i8 reductions (PR #67322)

Pierre-Andre Saulais via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 26 09:46:47 PDT 2023


================
@@ -52,3 +52,129 @@ define float @ff(ptr %p) {
   %sum = fadd float %sum3, %v4
   ret float %sum
 }
+
+define void @combine_v16i8(ptr noundef align 16 %ptr1, ptr noundef align 16 %ptr2) {
+  ; ENABLED-LABEL: combine_v16i8
+  ; ENABLED: ld.v4.u32
+  ; ENABLED: st.u32
----------------
pasaulais wrote:

That's true: the test only checks that the scalar loads are vectorized into a single vector load, not how the vector extractions are lowered. The lowering is checked in `llvm/test/CodeGen/NVPTX/v4i8-operations.ll` (though additions are not checked there either). Would the `LoadStoreVectorizer.ll` tests be better left out of this PR?

https://github.com/llvm/llvm-project/pull/67322

