[LLVMdev] Missed optimisation opportunities?

Nick Lewycky nicholas at mxc.ca
Sat Mar 30 13:56:03 PDT 2013


Jeremy Lakeman wrote:
> I'm writing a front end for an existing interpreted language with
> slightly odd semantics for primitive values.
> Similar to the values in a database table, any value could be null, even
> for non-pointer types.
> For example a boolean variable could be true, false, or null.
> To model this behaviour, I'm passing an {i1, [type]} around for every
> numeric type. And using insertvalue / extractvalue all over the place.
>
> So when compiling this simple CS example;
>
> function long fib (long al_x)
> long ll_i=0, ll_ret=0, ll_last=0, ll_last2=1
>
> if isnull(al_x) then
>      return al_x
> end if
>
> ll_i=1
> do while ll_i<=al_x
>      ll_ret=ll_last+ll_last2
>      ll_last2=ll_last
>      ll_last=ll_ret
>      ll_i++
> loop
> return ll_ret
> end function
>
> With the following function passes;
>          FunctionPasses->add(new DataLayout(*JITEngine->getDataLayout()));
>          FunctionPasses->add(createBasicAliasAnalysisPass());
>          FunctionPasses->add(createPromoteMemoryToRegisterPass());
>          FunctionPasses->add(createInstructionCombiningPass());
>          FunctionPasses->add(createReassociatePass());
>          FunctionPasses->add(createGVNPass());
>          FunctionPasses->add(createCFGSimplificationPass());
>
> Which does a pretty good job of cleaning up most of the messy & verbose
> code generated by my front end.
>
> I end up with the following bitcode;
>
> define x86_stdcallcc { i1, i32 } @fib({ i1, i32 } %arg_al_x) {
> entry:
>    %"3_4" = extractvalue { i1, i32 } %arg_al_x, 0
>    br i1 %"3_4", label %Line15_Offset120, label %Line8_Offset38
>
> Line8_Offset38:                                   ; preds =
> %Line9_Offset52, %entry
>    %ll_last2.0 = phi { i1, i32 } [ %ll_last.0, %Line9_Offset52 ], [ { i1
> false, i32 1 }, %entry ]
>    %ll_last.0 = phi { i1, i32 } [ %"9_64_composed", %Line9_Offset52 ], [
> zeroinitializer, %entry ]
>    %ll_i.0 = phi { i1, i32 } [ %"12_98", %Line9_Offset52 ], [ { i1
> false, i32 1 }, %entry ]
>    %"8_38_isnull" = extractvalue { i1, i32 } %ll_i.0, 0
>    %"8_38_value" = extractvalue { i1, i32 } %ll_i.0, 1
>    %"8_42_value" = extractvalue { i1, i32 } %arg_al_x, 1
>    %"8_46_value" = icmp sle i32 %"8_38_value", %"8_42_value"
>    %"8_48_not_null" = xor i1 %"8_38_isnull", true
>    %"8_48_and_not_null" = and i1 %"8_46_value", %"8_48_not_null"
>    br i1 %"8_48_and_not_null", label %Line9_Offset52, label
> %Line15_Offset120
>
> Line9_Offset52:                                   ; preds = %Line8_Offset38
>    %"9_60_isnull" = extractvalue { i1, i32 } %ll_last2.0, 0
>    %"9_56_isnull" = extractvalue { i1, i32 } %ll_last.0, 0
>    %"9_64_isnull" = or i1 %"9_56_isnull", %"9_60_isnull"
>    %"9_56_value" = extractvalue { i1, i32 } %ll_last.0, 1
>    %"9_60_value" = extractvalue { i1, i32 } %ll_last2.0, 1
>    %"9_64" = add i32 %"9_56_value", %"9_60_value"
>    %0 = insertvalue { i1, i32 } zeroinitializer, i1 %"9_64_isnull", 0
>    %"9_64_composed" = insertvalue { i1, i32 } %0, i32 %"9_64", 1
>    %"12_98_value" = add i32 %"8_38_value", 1
>    %1 = insertvalue { i1, i32 } zeroinitializer, i1 %"8_38_isnull", 0
>    %"12_98" = insertvalue { i1, i32 } %1, i32 %"12_98_value", 1
>    br label %Line8_Offset38
>
> Line15_Offset120:                                 ; preds =
> %Line8_Offset38, %entry
>    %_return_value.0 = phi { i1, i32 } [ %arg_al_x, %entry ], [
> %ll_last.0, %Line8_Offset38 ]
>    ret { i1, i32 } %_return_value.0
> }
>
> None of the local variables in this example can ever be null. Each of
> these extractvalue expressions will always evaluate to i1 0;
>
>    %"8_38_isnull" = extractvalue { i1, i32 } %ll_i.0, 0
>    %"9_60_isnull" = extractvalue { i1, i32 } %ll_last2.0, 0
>    %"9_56_isnull" = extractvalue { i1, i32 } %ll_last.0, 0
>
> Is there an existing function pass I could add that would clean this up?
> Or is this example just slightly too complex?

I don't think any pass in llvm will clean this up, but you can check 
that by running opt -O2. Off the top of my head, I think the place for 
this optimization would be jump-threading, by changing it to reason 
about individual members inside a struct.

Nick



More information about the llvm-dev mailing list