[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element

David Jones via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 30 21:57:00 PDT 2019


I am seeing crashes that bisect cleanly to this revision.

This is the crash reported by llc:

$ ./llc testcase-reduced.ll -o /dev/null
Stack dump:
0. Program arguments: ./llc testcase-reduced.ll -o /dev/null
1. Running pass 'Function Pass Manager' on module 'testcase-reduced.ll'.
2. Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on function '@f'
0  llc             0x00005599bdc0bb38
1  llc             0x00005599bdc0e3e6
2  libpthread.so.0 0x00007f6114bb49a0
3  llc             0x00005599bd43e595


This may simply be a latent bug that this revision exposes. The crash does
bisect cleanly to this revision, though, so I'm going to go ahead and revert
it as a stopgap measure.

Below is a fairly reduced test case (run it with the same llc command as
above). It crashes with llc as recent as r359636, but works if I revert just
this revision.

I hope that the test case below is enough to make progress. Sorry about the
revert. :-(

--dlj




$ cat testcase-reduced.ll
source_filename = "375b79.cpp"
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

define void @f() local_unnamed_addr #0 align 32 {
  br i1 undef, label %1, label %23

1:                                                ; preds = %0
  br i1 undef, label %24, label %2

2:                                                ; preds = %1
  br label %3

3:                                                ; preds = %3, %2
  %4 = phi <2 x i64> [ zeroinitializer, %2 ], [ %17, %3 ]
  %5 = phi <2 x i64> [ zeroinitializer, %2 ], [ %18, %3 ]
  %6 = load <2 x i32>, <2 x i32>* undef, align 4
  %7 = ashr <2 x i32> %6, <i32 31, i32 31>
  %8 = xor <2 x i32> zeroinitializer, %7
  %9 = or <2 x i32> %8, <i32 1, i32 1>
  %10 = call <2 x i32> @llvm.ctlz.v2i32(<2 x i32> %9, i1 true)
  %11 = xor <2 x i32> %10, <i32 31, i32 31>
  %12 = mul nuw nsw <2 x i32> %11, <i32 9, i32 9>
  %13 = add nuw nsw <2 x i32> %12, <i32 73, i32 73>
  %14 = lshr <2 x i32> %13, <i32 6, i32 6>
  %15 = zext <2 x i32> undef to <2 x i64>
  %16 = zext <2 x i32> %14 to <2 x i64>
  %17 = add <2 x i64> %4, %15
  %18 = add <2 x i64> %5, %16
  %19 = add i64 0, 24
  br i1 false, label %20, label %3, !llvm.loop !1

20:                                               ; preds = %3
  %21 = add <2 x i64> %18, %17
  %22 = add <2 x i64> undef, %21
  br i1 undef, label %23, label %24

23:                                               ; preds = %24, %20, %0
  ret void

24:                                               ; preds = %24, %20, %1
  br i1 undef, label %23, label %24, !llvm.loop !3
}

; Function Attrs: nounwind readnone speculatable
declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32>, i1 immarg) #1

attributes #0 = { "use-soft-float"="false" }
attributes #1 = { nounwind readnone speculatable }

!llvm.ident = !{!0}

!0 = !{!"clang version eleventeen (trunk)"}
!1 = distinct !{!1, !2}
!2 = !{!"llvm.loop.isvectorized", i32 1}
!3 = distinct !{!3, !4, !2}
!4 = !{!"llvm.loop.unroll.runtime.disable"}






On Fri, Apr 26, 2019 at 9:12 AM Roland Froese via llvm-commits <llvm-commits at lists.llvm.org> wrote:

> Author: froese
> Date: Fri Apr 26 09:14:17 2019
> New Revision: 359313
>
> URL: http://llvm.org/viewvc/llvm-project?rev=359313&view=rev
> Log:
> [PowerPC] Update P9 vector costs for insert/extract element
>
> The PPC vector cost model values for insert/extract element reflect older
> processors that lacked vector insert/extract and move-to/move-from VSR
> instructions.  Update getVectorInstrCost to give appropriate values for when
> the newer instructions are present.
>
> Differential Revision: https://reviews.llvm.org/D60160
>
> Modified:
>     llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
>     llvm/trunk/test/Analysis/CostModel/PowerPC/insert_extract.ll
>
> Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=359313&r1=359312&r2=359313&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Fri Apr 26 09:14:17 2019
> @@ -412,6 +412,35 @@ int PPCTTIImpl::getVectorInstrCost(unsig
>        return 0;
>
>      return Cost;
> +
> +  } else if (Val->getScalarType()->isIntegerTy() && Index != -1U) {
> +    if (ST->hasP9Altivec()) {
> +      if (ISD == ISD::INSERT_VECTOR_ELT)
> +        // A move-to VSR and a permute/insert.  Assume vector operation cost
> +        // for both (cost will be 2x on P9).
> +        return vectorCostAdjustment(2, Opcode, Val, nullptr);
> +
> +      // It's an extract.  Maybe we can do a cheap move-from VSR.
> +      unsigned EltSize = Val->getScalarSizeInBits();
> +      if (EltSize == 64) {
> +        unsigned MfvsrdIndex = ST->isLittleEndian() ? 1 : 0;
> +        if (Index == MfvsrdIndex)
> +          return 1;
> +      } else if (EltSize == 32) {
> +        unsigned MfvsrwzIndex = ST->isLittleEndian() ? 2 : 1;
> +        if (Index == MfvsrwzIndex)
> +          return 1;
> +      }
> +
> +      // We need a vector extract (or mfvsrld).  Assume vector operation cost.
> +      // The cost of the load constant for a vector extract is disregarded
> +      // (invariant, easily schedulable).
> +      return vectorCostAdjustment(1, Opcode, Val, nullptr);
> +
> +    } else if (ST->hasDirectMove())
> +      // Assume permute has standard cost.
> +      // Assume move-to/move-from VSR have 2x standard cost.
> +      return 3;
>    }
>
>    // Estimated cost of a load-hit-store delay.  This was obtained
>
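
For anyone reading along, the branch added above can be restated as a small
standalone model. This is only a sketch: Subtarget, Op, adjustForP9 and
integerElementCost are hypothetical names, not LLVM APIs. It assumes an
integer element type and a known constant index (the patch's
isIntegerTy() && Index != -1U guard), and it models vectorCostAdjustment as a
plain 2x on P9, which is what the "cost will be 2x on P9" comment and the
updated CHECK lines in this commit suggest. The real function takes more
inputs.

```cpp
#include <cassert>

// Hypothetical stand-in for the PPCSubtarget queries used in the patch.
struct Subtarget {
  bool HasP9Altivec;
  bool HasDirectMove;
  bool IsLittleEndian;
};

enum class Op { Insert, Extract };

// Assumption: on P9 the adjustment simply doubles the base cost.
int adjustForP9(int Cost) { return Cost * 2; }

// Returns the modelled cost, or -1 when the new branch does not apply and
// the pre-existing cost logic would be used instead.
int integerElementCost(const Subtarget &ST, Op O, unsigned EltSizeInBits,
                       unsigned Index) {
  if (ST.HasP9Altivec) {
    if (O == Op::Insert)
      return adjustForP9(2); // move-to VSR plus permute/insert

    // Extract: one doubleword and one word slot can use a single
    // move-from VSR (mfvsrd / mfvsrwz); which slot depends on endianness.
    if (EltSizeInBits == 64 && Index == (ST.IsLittleEndian ? 1u : 0u))
      return 1;
    if (EltSizeInBits == 32 && Index == (ST.IsLittleEndian ? 2u : 1u))
      return 1;

    return adjustForP9(1); // general vector extract (or mfvsrld)
  }
  if (ST.HasDirectMove)
    return 3; // one permute plus a 2x-cost move-to/move-from VSR
  return -1;
}

// Spot checks against the updated expectations in insert_extract.ll.
void checkAgainstQuotedTests() {
  const Subtarget P8LE{false, true, true};
  const Subtarget P9BE{true, true, false};
  const Subtarget P9LE{true, true, true};

  assert(integerElementCost(P8LE, Op::Insert, 32, 0) == 3);
  assert(integerElementCost(P9LE, Op::Insert, 32, 2) == 4);
  assert(integerElementCost(P9BE, Op::Extract, 32, 1) == 1);
  assert(integerElementCost(P9LE, Op::Extract, 32, 2) == 1);
  assert(integerElementCost(P9BE, Op::Extract, 64, 0) == 1);
  assert(integerElementCost(P9LE, Op::Extract, 64, 0) == 2);
  assert(integerElementCost(P9LE, Op::Extract, 16, 0) == 2);
}
```

The model covers only the cost numbers. It says nothing about why instruction
selection crashes on the reduced test case above.
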
> Modified: llvm/trunk/test/Analysis/CostModel/PowerPC/insert_extract.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/PowerPC/insert_extract.ll?rev=359313&r1=359312&r2=359313&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/Analysis/CostModel/PowerPC/insert_extract.ll (original)
> +++ llvm/trunk/test/Analysis/CostModel/PowerPC/insert_extract.ll Fri Apr 26 09:14:17 2019
> @@ -14,15 +14,15 @@ define i32 @insert(i32 %arg) {
>  ; CHECK-P7-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>  ; CHECK-P8LE-LABEL: 'insert'
> -; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
> +; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>  ; CHECK-P9BE-LABEL: 'insert'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>  ; CHECK-P9LE-LABEL: 'insert'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %x = insertelement <4 x i32> undef, i32 %arg, i32 0
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
>  ;
>    %x = insertelement <4 x i32> undef, i32 %arg, i32 0
> @@ -40,11 +40,11 @@ define i32 @extract(<4 x i32> %arg) {
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 %x
>  ;
>  ; CHECK-P9BE-LABEL: 'extract'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %x = extractelement <4 x i32> %arg, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %x = extractelement <4 x i32> %arg, i32 0
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 %x
>  ;
>  ; CHECK-P9LE-LABEL: 'extract'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %x = extractelement <4 x i32> %arg, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %x = extractelement <4 x i32> %arg, i32 0
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 %x
>  ;
>    %x = extractelement <4 x i32> %arg, i32 0
> @@ -83,15 +83,15 @@ define void @test4xi32(<4 x i32> %v1, i3
>  ; CHECK-P7-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P8LE-LABEL: 'test4xi32'
> -; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
> +; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9BE-LABEL: 'test4xi32'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9LE-LABEL: 'test4xi32'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>    %v2 = insertelement <4 x i32> %v1, i32 %x1, i32 2
> @@ -114,17 +114,17 @@ define void @vexti32(<4 x i32> %p1) {
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9BE-LABEL: 'vexti32'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <4 x i32> %p1, i32 0
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <4 x i32> %p1, i32 1
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i3 = extractelement <4 x i32> %p1, i32 2
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i4 = extractelement <4 x i32> %p1, i32 3
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i1 = extractelement <4 x i32> %p1, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %i2 = extractelement <4 x i32> %p1, i32 1
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i3 = extractelement <4 x i32> %p1, i32 2
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i4 = extractelement <4 x i32> %p1, i32 3
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9LE-LABEL: 'vexti32'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <4 x i32> %p1, i32 0
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <4 x i32> %p1, i32 1
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i3 = extractelement <4 x i32> %p1, i32 2
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i4 = extractelement <4 x i32> %p1, i32 3
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i1 = extractelement <4 x i32> %p1, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i2 = extractelement <4 x i32> %p1, i32 1
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %i3 = extractelement <4 x i32> %p1, i32 2
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i4 = extractelement <4 x i32> %p1, i32 3
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>    %i1 = extractelement <4 x i32> %p1, i32 0
> @@ -146,13 +146,13 @@ define void @vexti64(<2 x i64> %p1) {
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9BE-LABEL: 'vexti64'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <2 x i64> %p1, i32 0
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <2 x i64> %p1, i32 1
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %i1 = extractelement <2 x i64> %p1, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i2 = extractelement <2 x i64> %p1, i32 1
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9LE-LABEL: 'vexti64'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <2 x i64> %p1, i32 0
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <2 x i64> %p1, i32 1
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i1 = extractelement <2 x i64> %p1, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %i2 = extractelement <2 x i64> %p1, i32 1
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>    %i1 = extractelement <2 x i64> %p1, i32 0
> @@ -172,13 +172,13 @@ define void @vext(<8 x i16> %p1, <16 x i
>  ; CHECK-P8LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9BE-LABEL: 'vext'
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <8 x i16> %p1, i32 0
> -; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <16 x i8> %p2, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i1 = extractelement <8 x i16> %p1, i32 0
> +; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i2 = extractelement <16 x i8> %p2, i32 0
>  ; CHECK-P9BE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>  ; CHECK-P9LE-LABEL: 'vext'
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i1 = extractelement <8 x i16> %p1, i32 0
> -; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %i2 = extractelement <16 x i8> %p2, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i1 = extractelement <8 x i16> %p1, i32 0
> +; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %i2 = extractelement <16 x i8> %p2, i32 0
>  ; CHECK-P9LE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
>  ;
>    %i1 = extractelement <8 x i16> %p1, i32 0
>
>