[llvm] r204880 - X86: Correct vectorization cost model for v8f32->v8i8.

Thu Mar 27 11:06:19 PDT 2014

On Thu, Mar 27, 2014 at 11:03 AM, Nadav Rotem <nrotem at apple.com> wrote:
> This looks like a codegen problem and not an ISA problem.  Can we convert
> v8f32->v8i32 efficiently? Yes.  Can we convert v8i32->v8i8 efficiently? Yes.
> Can we do it in less than 7 instructions? Probably.

How many? I haven't looked at it in detail, but the codegen out of the
compiler seems to be cleaned up pretty well, post legalization. Do you
have an idea of how few we could do it in?

-eric

>
>
> On Mar 27, 2014, at 9:40 AM, Eric Christopher <echristo at gmail.com> wrote:
>
>
> On Mar 27, 2014 8:41 AM, "Jim Grosbach" <grosbach at apple.com> wrote:
>>
>>
>>
>> > On Mar 26, 2014, at 10:32 PM, Eric Christopher <echristo at gmail.com>
>> > wrote:
>> >
>> >> On Wed, Mar 26, 2014 at 5:04 PM, Jim Grosbach <grosbach at apple.com>
>> >> wrote:
>> >> Author: grosbach
>> >> Date: Wed Mar 26 19:04:11 2014
>> >> New Revision: 204880
>> >>
>> >> URL: http://llvm.org/viewvc/llvm-project?rev=204880&view=rev
>> >> Log:
>> >> X86: Correct vectorization cost model for v8f32->v8i8.
>> >>
>> >> Fix the cost model to reflect the reality of our codegen.
>> >
>> > Reality of our codegen or reality of the processors?
>> >
>>
>> The latter, though the cost model should be accurately reflecting both.
>
> It should, but in the latter a bug should be filed and a comment to that
> effect listed. "Realities of our CodeGen" really sounds like a deficiency
> we're papering over.
>
> -eric
>
> PS. Reading more of the patches I'll just CC Quentin on this response too.
> :)
>
>> > -eric
>> >
>> >>
>> >> rdar://16370633
>> >>
>> >> Added:
>> >>
>> >> llvm/trunk/test/Transforms/LoopVectorize/X86/fp_to_sint8-cost-model.ll
>> >> Modified:
>> >>    llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
>> >>
>> >> Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp
>> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=204880&r1=204879&r2=204880&view=diff
>> >>
>> >> ==============================================================================
>> >> --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original)
>> >> +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Wed Mar 26 19:04:11 2014
>> >> @@ -513,7 +513,7 @@ unsigned X86TTI::getCastInstrCost(unsign
>> >>     { ISD::UINT_TO_FP,  MVT::v4f64, MVT::v4i16, 2 },
>> >>     { ISD::UINT_TO_FP,  MVT::v4f64, MVT::v4i32, 6 },
>> >>
>> >> -    { ISD::FP_TO_SINT,  MVT::v8i8,  MVT::v8f32, 1 },
>> >> +    { ISD::FP_TO_SINT,  MVT::v8i8,  MVT::v8f32, 7 },
>> >>     { ISD::FP_TO_SINT,  MVT::v4i8,  MVT::v4f32, 1 },
>> >>   };
>> >>
>> >>
>> >> Added:
>> >> llvm/trunk/test/Transforms/LoopVectorize/X86/fp_to_sint8-cost-model.ll
>> >> URL:
>> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/fp_to_sint8-cost-model.ll?rev=204880&view=auto
>> >>
>> >> ==============================================================================
>> >> --- llvm/trunk/test/Transforms/LoopVectorize/X86/fp_to_sint8-cost-model.ll (added)
>> >> +++ llvm/trunk/test/Transforms/LoopVectorize/X86/fp_to_sint8-cost-model.ll Wed Mar 26 19:04:11 2014
>> >> @@ -0,0 +1,24 @@
>> >> +; RUN: opt < %s  -loop-vectorize -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx -S -debug-only=loop-vectorize 2>&1 | FileCheck %s
>> >> +
>> >> +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
>> >> +target triple = "x86_64-apple-macosx10.8.0"
>> >> +
>> >> +
>> >> +; CHECK: cost of 7 for VF 8 For instruction:   %conv = fptosi float %tmp to i8
>> >> +define void @float_to_sint8_cost(i8* noalias nocapture %a, float* noalias nocapture readonly %b) nounwind {
>> >> +entry:
>> >> +  br label %for.body
>> >> +for.body:
>> >> +  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
>> >> +  %arrayidx = getelementptr inbounds float* %b, i64 %indvars.iv
>> >> +  %tmp = load float* %arrayidx, align 4
>> >> +  %conv = fptosi float %tmp to i8
>> >> +  %arrayidx2 = getelementptr inbounds i8* %a, i64 %indvars.iv
>> >> +  store i8 %conv, i8* %arrayidx2, align 4
>> >> +  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>> >> +  %exitcond = icmp eq i64 %indvars.iv.next, 256
>> >> +  br i1 %exitcond, label %for.end, label %for.body
>> >> +
>> >> +for.end:
>> >> +  ret void
>> >> +}
>> >>
>> >>
>> >> _______________________________________________
>> >> llvm-commits mailing list
>> >> llvm-commits at cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
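
For readers following the thread, the table change under discussion can be
modeled in a few lines of standalone C++. This is only a sketch under stated
assumptions: the names Opcode, VT, ConversionCostEntry, kConversionCosts and
lookupConversionCost are hypothetical stand-ins, not LLVM's actual ISD/MVT
types or the real body of X86TTI::getCastInstrCost. Only the four table rows
are taken from the diff context quoted above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ISD opcodes and MVT types named in the diff.
enum class Opcode { FP_TO_SINT, UINT_TO_FP };
enum class VT { v4i8, v8i8, v4i16, v4i32, v4f32, v8f32, v4f64 };

// One row of a conversion cost table: {operation, destination, source, cost}.
struct ConversionCostEntry {
  Opcode Op;
  VT Dst;
  VT Src;
  int Cost;
};

// Rows copied from the diff context, with the post-r204880 value (7)
// for the v8f32 -> v8i8 conversion.
static const ConversionCostEntry kConversionCosts[] = {
    {Opcode::UINT_TO_FP, VT::v4f64, VT::v4i16, 2},
    {Opcode::UINT_TO_FP, VT::v4f64, VT::v4i32, 6},
    {Opcode::FP_TO_SINT, VT::v8i8, VT::v8f32, 7},
    {Opcode::FP_TO_SINT, VT::v4i8, VT::v4f32, 1},
};

// Linear search over the table. Returns -1 when no row matches, which in
// this sketch stands in for "fall back to a generic cost estimate".
int lookupConversionCost(Opcode Op, VT Dst, VT Src) {
  for (const ConversionCostEntry &E : kConversionCosts)
    if (E.Op == Op && E.Dst == Dst && E.Src == Src)
      return E.Cost;
  return -1;
}
```

With the post-commit table, looking up FP_TO_SINT from v8f32 to v8i8 yields 7,
which is the value the new test's CHECK line expects the vectorizer to report
for VF 8. The v4f32 to v4i8 row is untouched and still yields 1.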