[PATCH] Make bitcast, extractelement, and insertelement considered cheap for speculation in SimplifyCFG.

Hal Finkel hfinkel at anl.gov
Fri Dec 13 17:16:47 PST 2013


----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Matt Arsenault" <Matthew.Arsenault at amd.com>
> Cc: "llvm commits" <llvm-commits at cs.uiuc.edu>, reviews+D2218+public+1b17721e58b4714d at llvm-reviews.chandlerc.com
> Sent: Tuesday, November 19, 2013 1:23:41 PM
> Subject: Re: [PATCH] Make bitcast, extractelement, and insertelement	considered	cheap for speculation in SimplifyCFG.
> 
> ----- Original Message -----
> > From: "Matt Arsenault" <Matthew.Arsenault at amd.com>
> > To: "Nadav Rotem" <nrotem at apple.com>,
> > reviews+D2218+public+1b17721e58b4714d at llvm-reviews.chandlerc.com
> > Cc: "llvm commits" <llvm-commits at cs.uiuc.edu>
> > Sent: Tuesday, November 19, 2013 1:12:01 PM
> > Subject: Re: [PATCH] Make bitcast, extractelement, and
> > insertelement considered	cheap for speculation in SimplifyCFG.
> > 
> > On 11/19/2013 08:42 AM, Nadav Rotem wrote:
> > > Hi Matt,
> > >
> > > I don’t know this code very well but your patch looks okay to me.
> > >   Please benchmark the LLVM test suite and make sure that this
> > > change does not introduce new regressions.
> > >
> > > SimplifyCFG should not use TTI because this pass is executed very
> > > early in the optimization pipe.  TTI should only be used by
> > > “lowering” transformations in the late stages of the optimization
> > > pipe.
> > But it is included in SimplifyCFG, and it is used in a few places
> > there.
> > Are these supposed to be removed at some point?
> 
> No, TTI is used in a few other places in order to check for "free"
> operations. We just don't use the costs until later in the pipeline.

What ended up happening with this?

 -Hal

> 
>  -Hal
> 
> > 
> > > Thanks,
> > > Nadav
> > >
> > >
> > > On Nov 18, 2013, at 7:00 PM, Matt Arsenault
> > > <Matthew.Arsenault at amd.com> wrote:
> > >
> > >> Hi nadav,
> > >>
> > >> This help fold more branches into selects.
> > >> On R600, vectors are cheap and anything that helps
> > >> remove branches is very good.
> > >>
> > >> I don't know why this doesn't use the TTI for this cost
> > >> calculation or if it should.
> > >>
> > >> http://llvm-reviews.chandlerc.com/D2218
> > >>
> > >> Files:
> > >>   lib/Transforms/Utils/SimplifyCFG.cpp
> > >>   test/Transforms/SimplifyCFG/speculate-vector-ops.ll
> > >>
> > >> Index: lib/Transforms/Utils/SimplifyCFG.cpp
> > >> ===================================================================
> > >> --- lib/Transforms/Utils/SimplifyCFG.cpp
> > >> +++ lib/Transforms/Utils/SimplifyCFG.cpp
> > >> @@ -224,6 +224,9 @@
> > >>    case Instruction::Trunc:
> > >>    case Instruction::ZExt:
> > >>    case Instruction::SExt:
> > >> +  case Instruction::BitCast:
> > >> +  case Instruction::ExtractElement:
> > >> +  case Instruction::InsertElement:
> > >>      return 1; // These are all cheap.
> > >>
> > >>    case Instruction::Call:
> > >> Index: test/Transforms/SimplifyCFG/speculate-vector-ops.ll
> > >> ===================================================================
> > >> --- /dev/null
> > >> +++ test/Transforms/SimplifyCFG/speculate-vector-ops.ll
> > >> @@ -0,0 +1,60 @@
> > >> +; RUN: opt -S -simplifycfg < %s | FileCheck %s
> > >> +
> > >> +define i32 @speculate_vector_extract(i32 %d, <4 x i32> %v) #0 {
> > >> +; CHECK-LABEL: @speculate_vector_extract(
> > >> +; CHECK-NOT: br
> > >> +entry:
> > >> +  %conv = insertelement <4 x i32> undef, i32 %d, i32 0
> > >> +  %conv2 = insertelement <4 x i32> %conv, i32 %d, i32 1
> > >> +  %conv3 = insertelement <4 x i32> %conv2, i32 %d, i32 2
> > >> +  %conv4 = insertelement <4 x i32> %conv3, i32 %d, i32 3
> > >> +  %tmp6 = add nsw <4 x i32> %conv4, <i32 0, i32 -1, i32 -2, i32
> > >> -3>
> > >> +  %cmp = icmp eq <4 x i32> %tmp6, zeroinitializer
> > >> +  %cmp.ext = sext <4 x i1> %cmp to <4 x i32>
> > >> +  %tmp8 = extractelement <4 x i32> %cmp.ext, i32 0
> > >> +  %tobool = icmp eq i32 %tmp8, 0
> > >> +  br i1 %tobool, label %cond.else, label %cond.then
> > >> +
> > >> +return:                                           ; preds =
> > >> %cond.end28
> > >> +  ret i32 %cond32
> > >> +
> > >> +cond.then:                                        ; preds =
> > >> %entry
> > >> +  %tmp10 = extractelement <4 x i32> %v, i32 0
> > >> +  br label %cond.end
> > >> +
> > >> +cond.else:                                        ; preds =
> > >> %entry
> > >> +  %tmp12 = extractelement <4 x i32> %v, i32 3
> > >> +  br label %cond.end
> > >> +
> > >> +cond.end:                                         ; preds =
> > >> %cond.else, %cond.then
> > >> +  %cond = phi i32 [ %tmp10, %cond.then ], [ %tmp12, %cond.else
> > >> ]
> > >> +  %tmp14 = extractelement <4 x i32> %cmp.ext, i32 1
> > >> +  %tobool15 = icmp eq i32 %tmp14, 0
> > >> +  br i1 %tobool15, label %cond.else17, label %cond.then16
> > >> +
> > >> +cond.then16:                                      ; preds =
> > >> %cond.end
> > >> +  %tmp20 = extractelement <4 x i32> %v, i32 1
> > >> +  br label %cond.end18
> > >> +
> > >> +cond.else17:                                      ; preds =
> > >> %cond.end
> > >> +  br label %cond.end18
> > >> +
> > >> +cond.end18:                                       ; preds =
> > >> %cond.else17, %cond.then16
> > >> +  %cond22 = phi i32 [ %tmp20, %cond.then16 ], [ %cond,
> > >> %cond.else17 ]
> > >> +  %tmp24 = extractelement <4 x i32> %cmp.ext, i32 2
> > >> +  %tobool25 = icmp eq i32 %tmp24, 0
> > >> +  br i1 %tobool25, label %cond.else27, label %cond.then26
> > >> +
> > >> +cond.then26:                                      ; preds =
> > >> %cond.end18
> > >> +  %tmp30 = extractelement <4 x i32> %v, i32 2
> > >> +  br label %cond.end28
> > >> +
> > >> +cond.else27:                                      ; preds =
> > >> %cond.end18
> > >> +  br label %cond.end28
> > >> +
> > >> +cond.end28:                                       ; preds =
> > >> %cond.else27, %cond.then26
> > >> +  %cond32 = phi i32 [ %tmp30, %cond.then26 ], [ %cond22,
> > >> %cond.else27 ]
> > >> +  br label %return
> > >> +}
> > >> +
> > >> +attributes #0 = { nounwind }
> > >> <D2218.1.patch>
> > >
> > 
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory




More information about the llvm-commits mailing list