[all-commits] [llvm/llvm-project] f575b1: [LV] Add support for partial reductions without a ...

David Sherwood via All-commits all-commits at lists.llvm.org
Wed Jul 2 05:06:12 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f575b18fdc8359bddc8747dbb8b16e5d10705dda
      https://github.com/llvm/llvm-project/commit/f575b18fdc8359bddc8747dbb8b16e5d10705dda
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2025-07-02 (Wed, 02 Jul 2025)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
    M llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
    M llvm/test/CodeGen/AArch64/neon-partial-reduce-dot-product.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-chained.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product-mixed.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product.ll
    A llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce.ll

  Log Message:
  -----------
  [LV] Add support for partial reductions without a binary op (#133922)

Consider IR such as this:

for.body:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
  %accum = phi i32 [ 0, %entry ], [ %add, %for.body ]
  %gep.a = getelementptr i8, ptr %a, i64 %iv
  %load.a = load i8, ptr %gep.a, align 1
  %ext.a = zext i8 %load.a to i32
  %add = add i32 %ext.a, %accum
  %iv.next = add i64 %iv, 1
  %exitcond.not = icmp eq i64 %iv.next, 1025
  br i1 %exitcond.not, label %for.exit, label %for.body

Conceptually we can vectorise this using partial reductions too,
although the current loop vectoriser implementation only forms them
when the accumulated value is a multiply. For AArch64 this is easily
done with a udot or sdot with an identity operand, i.e. a vector of
(i16 1).
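
As a rough illustration of why the identity operand works, the
following C++ sketch emulates an unsigned dot-product step in scalar
code and compares it against the plain loop. It is not the code the
vectoriser emits. The helper names (scalar_sum, udot_step,
partial_reduce_sum) are hypothetical, and the sketch assumes 16 byte
lanes feeding four 32-bit accumulator lanes, with the vector of ones
using 8-bit lanes to match the i8 loads in the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference: sums zero-extended bytes, like the loop in the IR.
uint32_t scalar_sum(const std::vector<uint8_t> &a) {
  uint32_t accum = 0;
  for (uint8_t v : a)
    accum += static_cast<uint32_t>(v);
  return accum;
}

// Emulates one unsigned dot-product step (assumed shape: 16 x 8-bit inputs,
// 4 x 32-bit accumulators). Each accumulator lane adds the dot product of
// its own group of four bytes from x and y.
void udot_step(std::array<uint32_t, 4> &acc, const uint8_t *x,
               const uint8_t *y) {
  for (std::size_t lane = 0; lane < 4; ++lane)
    for (std::size_t k = 0; k < 4; ++k)
      acc[lane] += static_cast<uint32_t>(x[4 * lane + k]) *
                   static_cast<uint32_t>(y[4 * lane + k]);
}

// Partial reduction without a multiply in the source loop: the second
// operand is a splat of ones, so every product is just the extended input.
uint32_t partial_reduce_sum(const std::vector<uint8_t> &a) {
  std::array<uint32_t, 4> acc = {0, 0, 0, 0};
  uint8_t ones[16];
  for (uint8_t &o : ones)
    o = 1; // identity operand for the multiply

  std::size_t i = 0;
  for (; i + 16 <= a.size(); i += 16)
    udot_step(acc, a.data() + i, ones);

  // Final reduction of the four partial sums, then a scalar tail for any
  // elements that did not fill a whole 16-byte step.
  uint32_t total = acc[0] + acc[1] + acc[2] + acc[3];
  for (; i < a.size(); ++i)
    total += static_cast<uint32_t>(a[i]);
  return total;
}
```

With 1025 input bytes (the trip count of the loop above), 64 full
steps are taken and one element is left for the scalar tail; both
functions return the same sum.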

To do this I taught getScaledReductions that the accumulated value
may come from a unary op, in which case there is only one extension
to consider. Similarly, I updated VPlan and the AArch64 TTI cost
model to understand the possible unary op.
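
The two shapes can be sketched with a toy matcher. This is only an
illustration of the idea, not the code in LoopVectorize.cpp: Node,
Match and match_partial_reduction are hypothetical names, the toy IR
ignores signedness, and the scale factor is assumed to be the
accumulator width divided by the narrow input width.

```cpp
#include <optional>

// Hypothetical toy IR, unrelated to LLVM's own classes.
enum class Kind { Load, Ext, Mul };

struct Node {
  Kind kind;
  unsigned bits;             // result width in bits
  const Node *lhs = nullptr; // first operand, if any
  const Node *rhs = nullptr; // second operand, if any
};

struct Match {
  unsigned num_extends;  // 1 for the unary form, 2 for the binary form
  unsigned scale_factor; // accumulator bits / narrow input bits (assumed)
};

// Classifies the value being added to an accumulator of accum_bits bits.
std::optional<Match> match_partial_reduction(const Node &accumulated,
                                             unsigned accum_bits) {
  // Unary form: add(accum, ext(a)). Only one extension to inspect.
  if (accumulated.kind == Kind::Ext && accumulated.lhs) {
    unsigned in_bits = accumulated.lhs->bits;
    if (in_bits == 0 || accum_bits % in_bits != 0)
      return std::nullopt;
    return Match{1, accum_bits / in_bits};
  }

  // Binary form: add(accum, mul(ext(a), ext(b))). Two extensions whose
  // narrow inputs must have the same width in this toy model.
  if (accumulated.kind == Kind::Mul && accumulated.lhs && accumulated.rhs &&
      accumulated.lhs->kind == Kind::Ext &&
      accumulated.rhs->kind == Kind::Ext && accumulated.lhs->lhs &&
      accumulated.rhs->lhs) {
    unsigned a_bits = accumulated.lhs->lhs->bits;
    unsigned b_bits = accumulated.rhs->lhs->bits;
    if (a_bits != b_bits || a_bits == 0 || accum_bits % a_bits != 0)
      return std::nullopt;
    return Match{2, accum_bits / a_bits};
  }

  return std::nullopt; // neither shape: not a candidate in this sketch
}
```

For the IR above (an i8 load zero-extended to i32 and added to an i32
accumulator) the toy matcher takes the unary path and reports one
extension with a scale factor of 4.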

---------

Co-authored-by: Matt Devereau <matthew.devereau at arm.com>


