[llvm] 1c83716 - Revert "[CostModel] remove cost-kind predicate for vector reduction costs"

Sun Oct 25 09:21:36 PDT 2020

Thanks for the test case. I reduced it further, and this appears to be a
pre-existing bug: the IR below crashes with
"opt -analyze -cost-model costcrash.ll -S", independently of the cost
model change that I made:

target triple = "x86_64"
declare float @llvm.vector.reduce.fadd.v4f32(float, <4 x float>)
define void @c() {
  %r = call fast float @llvm.vector.reduce.fadd.v4f32(float 0.0, <4 x float> undef)
  ret void
}
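
For context on the quoted revert below, here is a toy C++ sketch of what
the restored cost-kind predicate does. It is not the BasicTTIImpl.h code:
CostKind, genericIntrinsicCost, typeBasedReductionCost and reductionCost
are hypothetical stand-ins, and the type-based numbers are invented. Only
the flat cost of 1 is taken from the SIZE/SIZE_LATE lines in the reverted
tests.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost kinds; NOT the real LLVM enum.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// Assumed generic fallback: a flat cost of 1, mirroring the "cost of 1"
// lines that the reverted tests expect for the size-oriented prefixes.
unsigned genericIntrinsicCost() { return 1; }

// Invented type-based estimate: one shuffle plus one op per halving step.
// Real targets compute something far more detailed.
unsigned typeBasedReductionCost(unsigned NumElts) {
  unsigned Cost = 0;
  while (NumElts > 1) {
    NumElts /= 2;
    Cost += 2;
  }
  return Cost;
}

// With the predicate (the state after the revert), only reciprocal
// throughput takes the type-based path; other kinds use the fallback.
// Without it (the reverted change), every kind takes the type-based path.
unsigned reductionCost(CostKind Kind, unsigned NumElts, bool UsePredicate) {
  if (UsePredicate && Kind != CostKind::RecipThroughput)
    return genericIntrinsicCost();
  return typeBasedReductionCost(NumElts);
}
```

In this toy model, a 16-element reduction queried for code size costs 1
with the predicate and 8 without it, which is the same shape of change the
test updates in the quoted diff show, with different numbers.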

On Sun, Oct 25, 2020 at 2:48 AM Martin Storsjö via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

>
> Author: Martin Storsjö
> Date: 2020-10-25T08:47:54+02:00
> New Revision: 1c8371692dfe8245bc6690ff1262dcced4649d21
>
> URL:
> https://github.com/llvm/llvm-project/commit/1c8371692dfe8245bc6690ff1262dcced4649d21
> DIFF:
> https://github.com/llvm/llvm-project/commit/1c8371692dfe8245bc6690ff1262dcced4649d21.diff
>
> LOG: Revert "[CostModel] remove cost-kind predicate for vector reduction costs"
>
> This reverts commit 22d10b8ab44f703b72b8316a9b3b8adc623ca73f.
>
> This broke compilation e.g. like this:
> $ cat synth.c
> *a;
> float *b;
> c() {
>   for (;;) {
>     float d = -*b * *a++;
>     d -= *--b * *a++;
>     d -= *--b * *a;
>     d -= *--b * *a;
>     e(d);
>   }
> }
> $ clang -target x86_64-linux-gnu -c -O2 -ffast-math synth.c
> clang: ../include/llvm/Support/Casting.h:104: static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::PointerType; From = llvm::Type]: Assertion `Val && "isa<> used on a null pointer"' failed.
>
> Added:
>
>
> Modified:
>     llvm/include/llvm/CodeGen/BasicTTIImpl.h
>     llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll
>     llvm/test/Analysis/CostModel/ARM/reduce-add.ll
>     llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
>
> Removed:
>
>
>
>
> ################################################################################
> diff  --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
> index 87b70411ef38..c615d48a1021 100644
> --- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
> +++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
> @@ -1202,6 +1202,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
>      case Intrinsic::vector_reduce_fmin:
>      case Intrinsic::vector_reduce_umax:
>      case Intrinsic::vector_reduce_umin: {
> +      // FIXME: all cost kinds should default to the same thing?
> +      if (CostKind != TTI::TCK_RecipThroughput)
> +        return BaseT::getIntrinsicInstrCost(ICA, CostKind);
>        IntrinsicCostAttributes Attrs(IID, RetTy, Args[0]->getType(), FMF, 1, I);
>        return getTypeBasedIntrinsicInstrCost(Attrs, CostKind);
>      }
>
> diff  --git a/llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll b/llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll
> index ea9b6a07f4e4..bffaa98c82aa 100644
> --- a/llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll
> +++ b/llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll
> @@ -213,11 +213,11 @@ define void @reduce_fmax(<16 x float> %va) {
>  ; LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>  ; SIZE-LABEL: 'reduce_fmax'
> -; SIZE-NEXT:  Cost Model: Found an estimated cost of 620 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
> +; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>  ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>  ; SIZE_LATE-LABEL: 'reduce_fmax'
> -; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 620 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
> +; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>  ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>    %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>
> diff  --git a/llvm/test/Analysis/CostModel/ARM/reduce-add.ll b/llvm/test/Analysis/CostModel/ARM/reduce-add.ll
> index b3cc0adf7460..2564c1e456c1 100644
> --- a/llvm/test/Analysis/CostModel/ARM/reduce-add.ll
> +++ b/llvm/test/Analysis/CostModel/ARM/reduce-add.ll
> @@ -22,19 +22,19 @@ define i32 @reduce_i64(i32 %arg) {
>  ; NEON-RECIP-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>  ; V8M-SIZE-LABEL: 'reduce_i64'
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 66 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
>  ; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
>  ;
>  ; NEON-SIZE-LABEL: 'reduce_i64'
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 29 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 103 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
>  ; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
>  ;
>    %V1  = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
> @@ -67,23 +67,23 @@ define i32 @reduce_i32(i32 %arg) {
>  ; NEON-RECIP-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>  ; V8M-SIZE-LABEL: 'reduce_i32'
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V2 = call i8 @llvm.vector.reduce.add.v2i8(<2 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V4 = call i8 @llvm.vector.reduce.add.v4i8(<4 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 18 for instruction: %V8 = call i8 @llvm.vector.reduce.add.v8i8(<8 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 35 for instruction: %V16 = call i8 @llvm.vector.reduce.add.v16i8(<16 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 68 for instruction: %V32 = call i8 @llvm.vector.reduce.add.v32i8(<32 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 133 for instruction: %V64 = call i8 @llvm.vector.reduce.add.v64i8(<64 x i8> undef)
> -; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 262 for instruction: %V128 = call i8 @llvm.vector.reduce.add.v128i8(<128 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i8 @llvm.vector.reduce.add.v2i8(<2 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4 = call i8 @llvm.vector.reduce.add.v4i8(<4 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8 = call i8 @llvm.vector.reduce.add.v8i8(<8 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16 = call i8 @llvm.vector.reduce.add.v16i8(<16 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32 = call i8 @llvm.vector.reduce.add.v32i8(<32 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = call i8 @llvm.vector.reduce.add.v64i8(<64 x i8> undef)
> +; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = call i8 @llvm.vector.reduce.add.v128i8(<128 x i8> undef)
>  ; V8M-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
>  ;
>  ; NEON-SIZE-LABEL: 'reduce_i32'
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V2 = call i8 @llvm.vector.reduce.add.v2i8(<2 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 53 for instruction: %V4 = call i8 @llvm.vector.reduce.add.v4i8(<4 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 150 for instruction: %V8 = call i8 @llvm.vector.reduce.add.v8i8(<8 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 391 for instruction: %V16 = call i8 @llvm.vector.reduce.add.v16i8(<16 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 488 for instruction: %V32 = call i8 @llvm.vector.reduce.add.v32i8(<32 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 681 for instruction: %V64 = call i8 @llvm.vector.reduce.add.v64i8(<64 x i8> undef)
> -; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1066 for instruction: %V128 = call i8 @llvm.vector.reduce.add.v128i8(<128 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i8 @llvm.vector.reduce.add.v2i8(<2 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4 = call i8 @llvm.vector.reduce.add.v4i8(<4 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8 = call i8 @llvm.vector.reduce.add.v8i8(<8 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16 = call i8 @llvm.vector.reduce.add.v16i8(<16 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32 = call i8 @llvm.vector.reduce.add.v32i8(<32 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = call i8 @llvm.vector.reduce.add.v64i8(<64 x i8> undef)
> +; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = call i8 @llvm.vector.reduce.add.v128i8(<128 x i8> undef)
>  ; NEON-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
>  ;
>    %V2   = call i8 @llvm.vector.reduce.add.v2i8(<2 x i8> undef)
>
> diff  --git a/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll b/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
> index 1bedeb7c22d1..d3bf703513eb 100644
> --- a/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
> +++ b/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
> @@ -213,11 +213,11 @@ define void @reduce_fmax(<16 x float> %va) {
>  ; LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>  ; SIZE-LABEL: 'reduce_fmax'
> -; SIZE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
> +; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>  ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>  ; SIZE_LATE-LABEL: 'reduce_fmax'
> -; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
> +; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>  ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
>  ;
>    %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
>
>
>