[all-commits] [llvm/llvm-project] 671072: [AArch64] Unrolling of loops with vector instructi...

Mon Jul 14 12:53:30 PDT 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 671072e830dace589f3b85d674c356e33330aa9a
      https://github.com/llvm/llvm-project/commit/671072e830dace589f3b85d674c356e33330aa9a
  Author: Ahmad Yasin <ahmad.yasin at apple.com>
  Date:   2025-07-14 (Mon, 14 Jul 2025)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    A llvm/test/Transforms/LoopUnroll/AArch64/vector.ll

  Log Message:
  -----------
  [AArch64] Unrolling of loops with vector instructions. (#147420)

This patch permits loops with vector instructions to be unrolled.

Today, `getUnrollingPreferences()` for AArch64 targets exits early if it
sees a vector instruction in any of the loop's blocks. This patch fixes
that, so common loops like the following one get a chance to be
unrolled:

    #include <arm_neon.h>

    void saxpy(float *dst, const float *src, const float a, const int len) {
        float32x4_t *vdst = (float32x4_t *)dst;
        const float32x4_t *vsrc = (const float32x4_t *)src;
        float32x4_t vk = vdupq_n_f32(a);
        /* Each iteration processes one 4-float vector. */
        for (int i = 0; i < (len >> 2); i++) {
            vdst[i] = vaddq_f32(vdst[i], vmulq_f32(vsrc[i], vk));
        }
    }
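
For illustration only, here is a portable sketch of what unrolling the
saxpy loop above by a factor of 2 amounts to. It is not taken from the
patch: the `f32x4` struct and the `madd4`, `saxpy_rolled` and
`saxpy_unrolled2` names are hypothetical stand-ins for the NEON type and
intrinsics, chosen so the code builds on any host. The factor of 2 is
also an assumption; the factor LLVM actually picks depends on the CPU
and on the unroller's heuristics.

```c
#include <assert.h>

/* Hypothetical stand-in for a 128-bit vector of four floats (float32x4_t). */
typedef struct { float lane[4]; } f32x4;

/* Lane-wise dst + src * k, mirroring vaddq_f32(dst, vmulq_f32(src, k)). */
f32x4 madd4(f32x4 dst, f32x4 src, f32x4 k) {
    f32x4 r;
    for (int l = 0; l < 4; l++)
        r.lane[l] = dst.lane[l] + src.lane[l] * k.lane[l];
    return r;
}

/* Original loop shape: one vector operation per iteration. */
void saxpy_rolled(f32x4 *vdst, const f32x4 *vsrc, f32x4 vk, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        vdst[i] = madd4(vdst[i], vsrc[i], vk);
}

/* Unrolled by 2 (assumed factor), with a remainder loop for odd n. */
void saxpy_unrolled2(f32x4 *vdst, const f32x4 *vsrc, f32x4 vk, int n) {
    assert(n >= 0);
    int i = 0;
    /* Main loop: two independent vector operations per iteration. */
    for (; i + 1 < n; i += 2) {
        vdst[i]     = madd4(vdst[i],     vsrc[i],     vk);
        vdst[i + 1] = madd4(vdst[i + 1], vsrc[i + 1], vk);
    }
    /* Remainder loop: at most one leftover vector. */
    for (; i < n; i++)
        vdst[i] = madd4(vdst[i], vsrc[i], vk);
}
```

Both functions compute the same result; the unrolled form executes
roughly half as many loop-control branches and exposes two independent
operations per iteration to the scheduler.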

Auto-vectorized loops are still not unrolled unless they were vectorized
without interleaving.

The provided test case (llvm/test/Transforms/LoopUnroll/AArch64/vector.ll)
shows the improvement on top of runtime or partial unrolling, depending
on the CPU.

PR: https://github.com/llvm/llvm-project/pull/147420
