<div dir="ltr">Update LangRef so it doesn't say the second type must be an i1?<div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Jan 23, 2019 at 8:00 AM Simon Pilgrim via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Author: rksimon<br>

Date: Wed Jan 23 08:00:22 2019<br>

New Revision: 351957<br>

<br>

URL: <a href="http://llvm.org/viewvc/llvm-project?rev=351957&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=351957&view=rev</a><br>

Log:<br>

[IR] Match intrinsic parameter by scalar/vectorwidth<br>

<br>

This patch replaces the existing LLVMVectorSameWidth matcher with LLVMScalarOrSameVectorWidth.<br>

<br>

The matched arguments must either both be scalars or both be vectors with the same number of elements; in either case the scalar/element type may differ from the reference and is the one specified in LLVMScalarOrSameVectorWidth.<br>
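As a rough illustration of the rule described above, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: SimpleType, matchesScalarOrSameVectorWidth and deriveType are hypothetical names invented for this example, and types are modelled as an element-type name plus an element count (0 meaning scalar). Note that the sketch returns true on a match, whereas Intrinsic::matchIntrinsicType in the patch returns true on a mismatch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::Type: either a scalar (NumElts == 0)
// or a fixed-width vector of NumElts elements of type Elt.
struct SimpleType {
  std::string Elt;   // element (or scalar) type name, e.g. "i32" or "i1"
  unsigned NumElts;  // 0 means scalar
  bool isVector() const { return NumElts != 0; }
};

// Returns true when Ty is acceptable relative to the reference type Ref:
// both are scalars, or both are vectors with the same number of elements,
// and Ty's scalar/element type is ExpectedElt.
bool matchesScalarOrSameVectorWidth(const SimpleType &Ref,
                                    const SimpleType &Ty,
                                    const std::string &ExpectedElt) {
  // One is a vector and the other is a scalar: reject.
  if (Ref.isVector() != Ty.isVector())
    return false;
  // Both are vectors: the element counts must agree.
  if (Ty.isVector() && Ref.NumElts != Ty.NumElts)
    return false;
  // Only the scalar/element type is allowed to differ from the reference.
  return Ty.Elt == ExpectedElt;
}

// Builds the type implied by the reference: a scalar Elt when Ref is a
// scalar, otherwise a vector of Elt with Ref's element count.
SimpleType deriveType(const SimpleType &Ref, const std::string &Elt) {
  assert(!Elt.empty() && "element type name must be non-empty");
  return SimpleType{Elt, Ref.NumElts};
}
```

Under these assumptions, an i32 reference accepts only an i1 overflow result and a 4-element i32 vector reference accepts only a 4-element i1 vector, which corresponds to the declarations in the added test file.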

<br>

I've updated the _overflow intrinsics to demonstrate this, allowing them to return an i1 or <N x i1> overflow result that matches the scalar/vector width of the other (add/sub/mul) result type.<br>

<br>

The masked load/store/gather/scatter intrinsics have also been updated to use this, although, because we specify the reference type to be llvm_anyvector_ty, the mask is still guaranteed to be <N x i1>, so there is no change in behaviour.<br>

<br>

Differential Revision: <a href="https://reviews.llvm.org/D57090" rel="noreferrer" target="_blank">https://reviews.llvm.org/D57090</a><br>

<br>

Added:<br>

    llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll<br>

Modified:<br>

    llvm/trunk/include/llvm/IR/Intrinsics.td<br>

    llvm/trunk/lib/IR/Function.cpp<br>

    llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>

    llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>

<br>

Modified: llvm/trunk/include/llvm/IR/Intrinsics.td<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=351957&r1=351956&r2=351957&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=351957&r1=351956&r2=351957&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/include/llvm/IR/Intrinsics.td (original)<br>

+++ llvm/trunk/include/llvm/IR/Intrinsics.td Wed Jan 23 08:00:22 2019<br>

@@ -156,10 +156,15 @@ class LLVMMatchType<int num><br>

 // the intrinsic is overloaded, so the matched type should be declared as iAny.<br>

 class LLVMExtendedType<int num> : LLVMMatchType<num>;<br>

 class LLVMTruncatedType<int num> : LLVMMatchType<num>;<br>

-class LLVMVectorSameWidth<int num, LLVMType elty><br>

-  : LLVMMatchType<num> {<br>

+<br>

+// Match the scalar/vector of another intrinsic parameter but with a different<br>

+// element type. Either both are scalars or both are vectors with the same<br>

+// number of elements.<br>

+class LLVMScalarOrSameVectorWidth<int idx, LLVMType elty><br>

+  : LLVMMatchType<idx> {<br>

   ValueType ElTy = elty.VT;<br>

 }<br>

+<br>

 class LLVMPointerTo<int num> : LLVMMatchType<num>;<br>

 class LLVMPointerToElt<int num> : LLVMMatchType<num>;<br>

 class LLVMVectorOfAnyPointersToElt<int num> : LLVMMatchType<num>;<br>

@@ -796,24 +801,30 @@ def int_adjust_trampoline : Intrinsic<[l<br>

 //<br>

<br>

 // Expose the carry flag from add operations on two integrals.<br>

-def int_sadd_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_sadd_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

-def int_uadd_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_uadd_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

<br>

-def int_ssub_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_ssub_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

-def int_usub_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_usub_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

<br>

-def int_smul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_smul_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

-def int_umul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],<br>

+def int_umul_with_overflow : Intrinsic<[llvm_anyint_ty,<br>

+                                        LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                        [LLVMMatchType<0>, LLVMMatchType<0>],<br>

                                        [IntrNoMem, IntrSpeculatable]>;<br>

<br>

@@ -1001,35 +1012,35 @@ def int_is_constant : Intrinsic<[llvm_i1<br>

 def int_masked_store : Intrinsic<[], [llvm_anyvector_ty,<br>

                                       LLVMAnyPointerType<LLVMMatchType<0>>,<br>

                                       llvm_i32_ty,<br>

-                                      LLVMVectorSameWidth<0, llvm_i1_ty>],<br>

+                                      LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                  [IntrArgMemOnly]>;<br>

<br>

 def int_masked_load  : Intrinsic<[llvm_anyvector_ty],<br>

                                  [LLVMAnyPointerType<LLVMMatchType<0>>, llvm_i32_ty,<br>

-                                  LLVMVectorSameWidth<0, llvm_i1_ty>, LLVMMatchType<0>],<br>

+                                  LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, LLVMMatchType<0>],<br>

                                  [IntrReadMem, IntrArgMemOnly]>;<br>

<br>

 def int_masked_gather: Intrinsic<[llvm_anyvector_ty],<br>

                                  [LLVMVectorOfAnyPointersToElt<0>, llvm_i32_ty,<br>

-                                  LLVMVectorSameWidth<0, llvm_i1_ty>,<br>

+                                  LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,<br>

                                   LLVMMatchType<0>],<br>

                                  [IntrReadMem]>;<br>

<br>

 def int_masked_scatter: Intrinsic<[],<br>

                                   [llvm_anyvector_ty,<br>

                                    LLVMVectorOfAnyPointersToElt<0>, llvm_i32_ty,<br>

-                                   LLVMVectorSameWidth<0, llvm_i1_ty>]>;<br>

+                                   LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>]>;<br>

<br>

 def int_masked_expandload: Intrinsic<[llvm_anyvector_ty],<br>

                                      [LLVMPointerToElt<0>,<br>

-                                      LLVMVectorSameWidth<0, llvm_i1_ty>,<br>

+                                      LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,<br>

                                       LLVMMatchType<0>],<br>

                                      [IntrReadMem]>;<br>

<br>

 def int_masked_compressstore: Intrinsic<[],<br>

                                      [llvm_anyvector_ty,<br>

                                       LLVMPointerToElt<0>,<br>

-                                      LLVMVectorSameWidth<0, llvm_i1_ty>],<br>

+                                      LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],<br>

                                      [IntrArgMemOnly]>;<br>

<br>

 // Test whether a pointer is associated with a type metadata identifier.<br>

<br>

Modified: llvm/trunk/lib/IR/Function.cpp<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=351957&r1=351956&r2=351957&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=351957&r1=351956&r2=351957&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/IR/Function.cpp (original)<br>

+++ llvm/trunk/lib/IR/Function.cpp Wed Jan 23 08:00:22 2019<br>

@@ -948,10 +948,9 @@ static Type *DecodeFixedType(ArrayRef<In<br>

   case IITDescriptor::SameVecWidthArgument: {<br>

     Type *EltTy = DecodeFixedType(Infos, Tys, Context);<br>

     Type *Ty = Tys[D.getArgumentNumber()];<br>

-    if (VectorType *VTy = dyn_cast<VectorType>(Ty)) {<br>

+    if (auto *VTy = dyn_cast<VectorType>(Ty))<br>

       return VectorType::get(EltTy, VTy->getNumElements());<br>

-    }<br>

-    llvm_unreachable("unhandled");<br>

+    return EltTy;<br>

   }<br>

   case IITDescriptor::PtrToArgument: {<br>

     Type *Ty = Tys[D.getArgumentNumber()];<br>

@@ -1135,15 +1134,19 @@ bool Intrinsic::matchIntrinsicType(Type<br>

     case IITDescriptor::SameVecWidthArgument: {<br>

       if (D.getArgumentNumber() >= ArgTys.size())<br>

         return true;<br>

-      VectorType * ReferenceType =<br>

-        dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>

-      VectorType *ThisArgType = dyn_cast<VectorType>(Ty);<br>

-      if (!ThisArgType || !ReferenceType ||<br>

-          (ReferenceType->getVectorNumElements() !=<br>

-           ThisArgType->getVectorNumElements()))<br>

-        return true;<br>

-      return matchIntrinsicType(ThisArgType->getVectorElementType(),<br>

-                                Infos, ArgTys);<br>

+      auto *ReferenceType = dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>

+      auto *ThisArgType = dyn_cast<VectorType>(Ty);<br>

+      // Both must be vectors of the same number of elements or neither.<br>

+      if ((ReferenceType != nullptr) != (ThisArgType != nullptr))<br>

+        return true;<br>

+      Type *EltTy = Ty;<br>

+      if (ThisArgType) {<br>

+        if (ReferenceType->getVectorNumElements() !=<br>

+            ThisArgType->getVectorNumElements())<br>

+          return true;<br>

+        EltTy = ThisArgType->getVectorElementType();<br>

+      }<br>

+      return matchIntrinsicType(EltTy, Infos, ArgTys);<br>

     }<br>

     case IITDescriptor::PtrToArgument: {<br>

       if (D.getArgumentNumber() >= ArgTys.size())<br>

<br>

Added: llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll?rev=351957&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll?rev=351957&view=auto</a><br>

==============================================================================<br>

--- llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll (added)<br>

+++ llvm/trunk/test/Analysis/CostModel/X86/arith-overflow.ll Wed Jan 23 08:00:22 2019<br>

@@ -0,0 +1,414 @@<br>

+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+ssse3 | FileCheck %s --check-prefixes=CHECK,SSE,SSSE3<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+sse4.2 | FileCheck %s --check-prefixes=CHECK,SSE,SSE42<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+avx | FileCheck %s --check-prefixes=CHECK,AVX,AVX1<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+avx2 | FileCheck %s --check-prefixes=CHECK,AVX,AVX2<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+avx512f | FileCheck %s --check-prefixes=CHECK,AVX512,AVX512F<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=CHECK,AVX512,AVX512BW<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=CHECK,AVX512,AVX512DQ<br>

+;<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mcpu=slm | FileCheck %s --check-prefixes=CHECK,SLM<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mcpu=goldmont | FileCheck %s --check-prefixes=CHECK,GLM<br>

+; RUN: opt < %s -cost-model -analyze -mtriple=x86_64-apple-macosx10.8.0 -mcpu=btver2 | FileCheck %s --check-prefixes=CHECK,BTVER2<br>

+<br>

+;<br>

+; sadd.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.sadd.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.sadd.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.sadd.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.sadd.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.sadd.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.sadd.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.sadd.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.sadd.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.sadd.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.sadd.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.sadd.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.sadd.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.sadd.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.sadd.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.sadd.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.sadd.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @sadd(i32 %arg) {<br>

+; CHECK-LABEL: 'sadd'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.sadd.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.sadd.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.sadd.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.sadd.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.sadd.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.sadd.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.sadd.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.sadd.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.sadd.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.sadd.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.sadd.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.sadd.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.sadd.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.sadd.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.sadd.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.sadd.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.sadd.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.sadd.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.sadd.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.sadd.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.sadd.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.sadd.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.sadd.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.sadd.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.sadd.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.sadd.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

+<br>

+;<br>

+; uadd.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.uadd.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.uadd.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.uadd.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.uadd.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.uadd.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.uadd.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.uadd.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.uadd.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.uadd.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.uadd.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.uadd.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.uadd.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.uadd.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.uadd.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.uadd.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.uadd.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @uadd(i32 %arg) {<br>

+; CHECK-LABEL: 'uadd'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.uadd.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.uadd.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.uadd.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.uadd.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.uadd.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.uadd.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.uadd.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.uadd.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.uadd.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.uadd.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.uadd.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.uadd.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.uadd.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.uadd.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.uadd.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.uadd.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.uadd.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.uadd.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.uadd.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.uadd.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.uadd.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.uadd.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.uadd.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.uadd.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.uadd.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.uadd.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.uadd.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.uadd.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

+<br>

+;<br>

+; ssub.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.ssub.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.ssub.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.ssub.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.ssub.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.ssub.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.ssub.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.ssub.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.ssub.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.ssub.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.ssub.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.ssub.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.ssub.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.ssub.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.ssub.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.ssub.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.ssub.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @ssub(i32 %arg) {<br>

+; CHECK-LABEL: 'ssub'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.ssub.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.ssub.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.ssub.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.ssub.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.ssub.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.ssub.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.ssub.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.ssub.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.ssub.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.ssub.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.ssub.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.ssub.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.ssub.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.ssub.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.ssub.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.ssub.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.ssub.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.ssub.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.ssub.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.ssub.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.ssub.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.ssub.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.ssub.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.ssub.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.ssub.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.ssub.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.ssub.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.ssub.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.ssub.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

+<br>

+;<br>

+; usub.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.usub.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.usub.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.usub.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.usub.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.usub.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.usub.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.usub.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.usub.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.usub.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.usub.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.usub.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.usub.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.usub.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.usub.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.usub.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.usub.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @usub(i32 %arg) {<br>

+; CHECK-LABEL: 'usub'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.usub.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.usub.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.usub.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.usub.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.usub.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.usub.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.usub.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.usub.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.usub.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.usub.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.usub.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.usub.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.usub.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.usub.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.usub.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.usub.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.usub.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.usub.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.usub.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.usub.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.usub.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.usub.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.usub.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.usub.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.usub.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.usub.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.usub.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.usub.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.usub.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.usub.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.usub.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

+<br>

+;<br>

+; smul.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.smul.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.smul.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.smul.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.smul.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.smul.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.smul.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.smul.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.smul.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.smul.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.smul.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.smul.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.smul.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.smul.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.smul.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.smul.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.smul.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @smul(i32 %arg) {<br>

+; CHECK-LABEL: 'smul'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.smul.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.smul.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.smul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.smul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.smul.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.smul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.smul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.smul.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.smul.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.smul.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.smul.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.smul.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.smul.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.smul.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.smul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.smul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.smul.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.smul.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.smul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.smul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.smul.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.smul.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.smul.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.smul.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.smul.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

+<br>

+;<br>

+; umul.with.overflow<br>

+;<br>

+<br>

+declare {i64, i1}              @llvm.umul.with.overflow.i64(i64, i64)<br>

+declare {<2 x i64>, <2 x i1>}  @llvm.umul.with.overflow.v2i64(<2 x i64>, <2 x i64>)<br>

+declare {<4 x i64>, <4 x i1>}  @llvm.umul.with.overflow.v4i64(<4 x i64>, <4 x i64>)<br>

+declare {<8 x i64>, <8 x i1>}  @llvm.umul.with.overflow.v8i64(<8 x i64>, <8 x i64>)<br>

+<br>

+declare {i32, i1}               @llvm.umul.with.overflow.i32(i32, i32)<br>

+declare {<4 x i32>, <4 x i1>}   @llvm.umul.with.overflow.v4i32(<4 x i32>, <4 x i32>)<br>

+declare {<8 x i32>, <8 x i1>}   @llvm.umul.with.overflow.v8i32(<8 x i32>, <8 x i32>)<br>

+declare {<16 x i32>, <16 x i1>} @llvm.umul.with.overflow.v16i32(<16 x i32>, <16 x i32>)<br>

+<br>

+declare {i16, i1}               @llvm.umul.with.overflow.i16(i16, i16)<br>

+declare {<8 x i16>,  <8 x i1>}  @llvm.umul.with.overflow.v8i16(<8 x i16>, <8 x i16>)<br>

+declare {<16 x i16>, <16 x i1>} @llvm.umul.with.overflow.v16i16(<16 x i16>, <16 x i16>)<br>

+declare {<32 x i16>, <32 x i1>} @llvm.umul.with.overflow.v32i16(<32 x i16>, <32 x i16>)<br>

+<br>

+declare {i8, i1}                @llvm.umul.with.overflow.i8(i8, i8)<br>

+declare {<16 x i8>, <16 x i1>}  @llvm.umul.with.overflow.v16i8(<16 x i8>, <16 x i8>)<br>

+declare {<32 x i8>, <32 x i1>}  @llvm.umul.with.overflow.v32i8(<32 x i8>, <32 x i8>)<br>

+declare {<64 x i8>, <64 x i1>}  @llvm.umul.with.overflow.v64i8(<64 x i8>, <64 x i8>)<br>

+<br>

+define i32 @umul(i32 %arg) {<br>

+; CHECK-LABEL: 'umul'<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 undef, i64 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2I64 = call { <2 x i64>, <2 x i1> } @llvm.umul.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.umul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.umul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 undef, i32 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = call { i16, i1 } @llvm.umul.with.overflow.i16(i16 undef, i16 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.umul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.umul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I16 = call { <32 x i16>, <32 x i1> } @llvm.umul.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 undef, i8 undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %V16I8 = call { <16 x i8>, <16 x i1> } @llvm.umul.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %V32I8 = call { <32 x i8>, <32 x i1> } @llvm.umul.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 191 for instruction: %V64I8 = call { <64 x i8>, <64 x i1> } @llvm.umul.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef<br>

+;<br>

+  %I64 = call {i64, i1} @llvm.umul.with.overflow.i64(i64 undef, i64 undef)<br>

+  %V2I64 = call {<2 x i64>, <2 x i1>} @llvm.umul.with.overflow.v2i64(<2 x i64> undef, <2 x i64> undef)<br>

+  %V4I64 = call {<4 x i64>, <4 x i1>} @llvm.umul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)<br>

+  %V8I64 = call {<8 x i64>, <8 x i1>} @llvm.umul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)<br>

+<br>

+  %I32 = call {i32, i1} @llvm.umul.with.overflow.i32(i32 undef, i32 undef)<br>

+  %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)<br>

+  %V8I32  = call {<8 x i32>, <8 x i1>}  @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)<br>

+  %V16I32 = call {<16 x i32>, <16 x i1>} @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)<br>

+<br>

+  %I16 = call {i16, i1} @llvm.umul.with.overflow.i16(i16 undef, i16 undef)<br>

+  %V8I16  = call {<8 x i16>, <8 x i1>}  @llvm.umul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)<br>

+  %V16I16 = call {<16 x i16>, <16 x i1>} @llvm.umul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)<br>

+  %V32I16 = call {<32 x i16>, <32 x i1>} @llvm.umul.with.overflow.v32i16(<32 x i16> undef, <32 x i16> undef)<br>

+<br>

+  %I8 = call {i8, i1} @llvm.umul.with.overflow.i8(i8 undef, i8 undef)<br>

+  %V16I8 = call {<16 x i8>, <16 x i1>} @llvm.umul.with.overflow.v16i8(<16 x i8> undef, <16 x i8> undef)<br>

+  %V32I8 = call {<32 x i8>, <32 x i1>} @llvm.umul.with.overflow.v32i8(<32 x i8> undef, <32 x i8> undef)<br>

+  %V64I8 = call {<64 x i8>, <64 x i1>} @llvm.umul.with.overflow.v64i8(<64 x i8> undef, <64 x i8> undef)<br>

+<br>

+  ret i32 undef<br>

+}<br>

<br>

Modified: llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=351957&r1=351956&r2=351957&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=351957&r1=351956&r2=351957&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/utils/TableGen/CodeGenTarget.cpp (original)<br>

+++ llvm/trunk/utils/TableGen/CodeGenTarget.cpp Wed Jan 23 08:00:22 2019<br>

@@ -633,7 +633,7 @@ CodeGenIntrinsic::CodeGenIntrinsic(Recor<br>

       // overloaded, all the types can be specified directly.<br>

       assert(((!TyEl->isSubClassOf("LLVMExtendedType") &&<br>

                !TyEl->isSubClassOf("LLVMTruncatedType") &&<br>

-               !TyEl->isSubClassOf("LLVMVectorSameWidth")) ||<br>

+               !TyEl->isSubClassOf("LLVMScalarOrSameVectorWidth")) ||<br>

               VT == MVT::iAny || VT == MVT::vAny) &&<br>

              "Expected iAny or vAny type");<br>

     } else<br>

<br>

Modified: llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=351957&r1=351956&r2=351957&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=351957&r1=351956&r2=351957&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp (original)<br>

+++ llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp Wed Jan 23 08:00:22 2019<br>

@@ -269,7 +269,7 @@ static void EncodeFixedType(Record *R, s<br>

       Sig.push_back(IIT_TRUNC_ARG);<br>

     else if (R->isSubClassOf("LLVMHalfElementsVectorType"))<br>

       Sig.push_back(IIT_HALF_VEC_ARG);<br>

-    else if (R->isSubClassOf("LLVMVectorSameWidth")) {<br>

+    else if (R->isSubClassOf("LLVMScalarOrSameVectorWidth")) {<br>

       Sig.push_back(IIT_SAME_VEC_WIDTH_ARG);<br>

       Sig.push_back((Number << 3) | ArgCodes[Number]);<br>

       MVT::SimpleValueType VT = getValueType(R->getValueAsDef("ElTy"));<br>

<br>

<br>


</blockquote></div>