[PATCH] Make bitcast, extractelement, and insertelement considered cheap for speculation in SimplifyCFG.

Matt Arsenault Matthew.Arsenault at amd.com
Tue Nov 19 11:12:01 PST 2013


On 11/19/2013 08:42 AM, Nadav Rotem wrote:
> Hi Matt,
>
> I don’t know this code very well but your patch looks okay to me.   Please benchmark the LLVM test suite and make sure that this change does not introduce new regressions.
>
> SimplifyCFG should not use TTI because this pass is executed very early in the optimization pipe.  TTI should only be used by “lowering” transformations in the late stages of the optimization pipe.
But it is included in SimplifyCFG, and it is used in a few places there. 
Are these supposed to be removed at some point?

> Thanks,
> Nadav
>
>
> On Nov 18, 2013, at 7:00 PM, Matt Arsenault <Matthew.Arsenault at amd.com> wrote:
>
>> Hi nadav,
>>
>> This help fold more branches into selects.
>> On R600, vectors are cheap and anything that helps
>> remove branches is very good.
>>
>> I don't know why this doesn't use the TTI for this cost
>> calculation or if it should.
>>
>> http://llvm-reviews.chandlerc.com/D2218
>>
>> Files:
>>   lib/Transforms/Utils/SimplifyCFG.cpp
>>   test/Transforms/SimplifyCFG/speculate-vector-ops.ll
>>
>> Index: lib/Transforms/Utils/SimplifyCFG.cpp
>> ===================================================================
>> --- lib/Transforms/Utils/SimplifyCFG.cpp
>> +++ lib/Transforms/Utils/SimplifyCFG.cpp
>> @@ -224,6 +224,9 @@
>>    case Instruction::Trunc:
>>    case Instruction::ZExt:
>>    case Instruction::SExt:
>> +  case Instruction::BitCast:
>> +  case Instruction::ExtractElement:
>> +  case Instruction::InsertElement:
>>      return 1; // These are all cheap.
>>
>>    case Instruction::Call:
>> Index: test/Transforms/SimplifyCFG/speculate-vector-ops.ll
>> ===================================================================
>> --- /dev/null
>> +++ test/Transforms/SimplifyCFG/speculate-vector-ops.ll
>> @@ -0,0 +1,60 @@
>> +; RUN: opt -S -simplifycfg < %s | FileCheck %s
>> +
>> +define i32 @speculate_vector_extract(i32 %d, <4 x i32> %v) #0 {
>> +; CHECK-LABEL: @speculate_vector_extract(
>> +; CHECK-NOT: br
>> +entry:
>> +  %conv = insertelement <4 x i32> undef, i32 %d, i32 0
>> +  %conv2 = insertelement <4 x i32> %conv, i32 %d, i32 1
>> +  %conv3 = insertelement <4 x i32> %conv2, i32 %d, i32 2
>> +  %conv4 = insertelement <4 x i32> %conv3, i32 %d, i32 3
>> +  %tmp6 = add nsw <4 x i32> %conv4, <i32 0, i32 -1, i32 -2, i32 -3>
>> +  %cmp = icmp eq <4 x i32> %tmp6, zeroinitializer
>> +  %cmp.ext = sext <4 x i1> %cmp to <4 x i32>
>> +  %tmp8 = extractelement <4 x i32> %cmp.ext, i32 0
>> +  %tobool = icmp eq i32 %tmp8, 0
>> +  br i1 %tobool, label %cond.else, label %cond.then
>> +
>> +return:                                           ; preds = %cond.end28
>> +  ret i32 %cond32
>> +
>> +cond.then:                                        ; preds = %entry
>> +  %tmp10 = extractelement <4 x i32> %v, i32 0
>> +  br label %cond.end
>> +
>> +cond.else:                                        ; preds = %entry
>> +  %tmp12 = extractelement <4 x i32> %v, i32 3
>> +  br label %cond.end
>> +
>> +cond.end:                                         ; preds = %cond.else, %cond.then
>> +  %cond = phi i32 [ %tmp10, %cond.then ], [ %tmp12, %cond.else ]
>> +  %tmp14 = extractelement <4 x i32> %cmp.ext, i32 1
>> +  %tobool15 = icmp eq i32 %tmp14, 0
>> +  br i1 %tobool15, label %cond.else17, label %cond.then16
>> +
>> +cond.then16:                                      ; preds = %cond.end
>> +  %tmp20 = extractelement <4 x i32> %v, i32 1
>> +  br label %cond.end18
>> +
>> +cond.else17:                                      ; preds = %cond.end
>> +  br label %cond.end18
>> +
>> +cond.end18:                                       ; preds = %cond.else17, %cond.then16
>> +  %cond22 = phi i32 [ %tmp20, %cond.then16 ], [ %cond, %cond.else17 ]
>> +  %tmp24 = extractelement <4 x i32> %cmp.ext, i32 2
>> +  %tobool25 = icmp eq i32 %tmp24, 0
>> +  br i1 %tobool25, label %cond.else27, label %cond.then26
>> +
>> +cond.then26:                                      ; preds = %cond.end18
>> +  %tmp30 = extractelement <4 x i32> %v, i32 2
>> +  br label %cond.end28
>> +
>> +cond.else27:                                      ; preds = %cond.end18
>> +  br label %cond.end28
>> +
>> +cond.end28:                                       ; preds = %cond.else27, %cond.then26
>> +  %cond32 = phi i32 [ %tmp30, %cond.then26 ], [ %cond22, %cond.else27 ]
>> +  br label %return
>> +}
>> +
>> +attributes #0 = { nounwind }
>> <D2218.1.patch>
>






More information about the llvm-commits mailing list