[llvm] [SVE][InstCombine] Fold ld1d and splice into ld1ro (PR #69565)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 19 05:21:17 PDT 2023


================
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -mattr=+sve,+f64mm -passes=instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define <vscale x 2 x double> @combine_ld1ro_double(<vscale x 2 x i1> %pred, ptr %addr) {
+; CHECK-LABEL: @combine_ld1ro_double(
+; CHECK-NEXT:    [[RES:%.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.ld1ro.nxv2f64(<vscale x 2 x i1> [[PRED:%.*]], ptr [[ADDR:%.*]])
----------------
david-arm wrote:

This transformation doesn't seem right to me. The ld1rod instruction definition says "Load four contiguous doublewords to elements of a 256-bit (octaword) vector from the memory address". So first you have to prove at compile time that you're loading exactly 256 bits, which requires vscale to be exactly 2. However, even then I am not sure the transformation is valid: don't you also need to prove that the mask used for the load and splice is all ones?
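
To make the width argument concrete, here is a rough sketch of the arithmetic only. It is not LLVM code: the helper names `nxv2f64_bits` and `widths_match` are hypothetical, and it assumes the only thing being compared is the number of bits covered by a `<vscale x 2 x double>` value against the fixed 256 bits that ld1rod reads. It does not model the predicate, so the all-ones mask question still has to be checked separately.

```python
# Hypothetical helpers for illustration; none of these names exist in LLVM.

# ld1rod always reads four contiguous doublewords: 4 * 64 = 256 bits.
LD1RO_BITS = 4 * 64


def nxv2f64_bits(vscale: int) -> int:
    """Width in bits of <vscale x 2 x double> for a given vscale."""
    return vscale * 2 * 64


def widths_match(vscale: int) -> bool:
    """True only when a full-vector load covers exactly 256 bits."""
    return nxv2f64_bits(vscale) == LD1RO_BITS


# SVE vector lengths range from 128 to 2048 bits, i.e. vscale from 1 to 16.
matching = [v for v in (1, 2, 4, 8, 16) if widths_match(v)]
print(matching)  # [2]
```

Under these assumptions, any vscale other than 2 makes the two loads cover different amounts of memory, so the fold would need a compile-time guarantee that vscale is exactly 2.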

https://github.com/llvm/llvm-project/pull/69565

