[LLVMdev] Missing Optimization Opportunities

Benjamin Kramer benny.kra at googlemail.com
Fri Sep 10 16:02:58 PDT 2010


On 10.09.2010, at 16:08, Mai, Haohui wrote:

> Hi,
> 
> I'm using LLVM 2.7 right now, and I found "opt -std-compile-opts" has
> missed some opportunities for optimization:
> 
> define void @spa.main() readonly {
> entry:
>  %tmp = load i32* @dst-ip                        ; <i32> [#uses=3]
>  %tmp1 = and i32 %tmp, -16777216                 ; <i32> [#uses=1]
>  %tmp2 = icmp eq i32 %tmp1, 167772160            ; <i1> [#uses=2]
>  %tmp3 = and i32 %tmp, -65536                    ; <i32> [#uses=2]
>  %tmp4 = icmp ne i32 %tmp3, 168296448            ; <i1> [#uses=1]
>  %tmp5 = and i1 %tmp2, %tmp4                     ; <i1> [#uses=1]
>  %tmp6 = and i32 %tmp, -256                      ; <i32> [#uses=2]
>  %tmp7 = icmp eq i32 %tmp6, 168296704            ; <i1> [#uses=1]
>  %tmp8 = icmp eq i32 %tmp3, 168296448            ; <i1> [#uses=2]
>  %tmp9 = icmp ne i32 %tmp6, 168296704            ; <i1> [#uses=1]
>  %tmp10 = and i1 %tmp8, %tmp9                    ; <i1> [#uses=1]
>  %tmp11 = load i32* @src-ip                      ; <i32> [#uses=1]
>  %tmp12 = and i32 %tmp11, -16777216              ; <i32> [#uses=1]
>  %tmp13 = icmp eq i32 %tmp12, 721420288          ; <i1> [#uses=3]
>  %tmp14 = and i1 %tmp2, %tmp5                    ; <i1> [#uses=1]
>  %tmp15 = and i1 %tmp13, %tmp14                  ; <i1> [#uses=1]
>  tail call void @spa.assert(i1 %tmp15)
>  %tmp16 = and i1 %tmp8, %tmp10                   ; <i1> [#uses=1]
>  %tmp17 = and i1 %tmp13, %tmp16                  ; <i1> [#uses=1]
>  tail call void @spa.assert(i1 %tmp17)
>  %tmp18 = and i1 %tmp13, %tmp7                   ; <i1> [#uses=1]
>  tail call void @spa.assert(i1 %tmp18)
>  ret void
> }
> 
> Please note that the following code sequence is not optimal:
> 
> %tmp5 = and i1 %tmp2, %tmp4                     ; <i1> [#uses=1]
> %tmp14 = and i1 %tmp2, %tmp5                    ; <i1> [#uses=1]
> 
> Is this a problem with scalar evolution, or do I just need to run another pass?

The reassociate pass used to do this, but it was disabled for "i1" values because
it pessimized some code (r95333). I pushed a fix (r113651) to trunk that adds support
for this optimization to the InstructionSimplify analysis (which is run by the
instcombine pass, among others).
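
To see why the quoted sequence is redundant: %tmp14 is "and %tmp2, (and %tmp2, %tmp4)", which has the same value as %tmp5 for every input. The same holds for %tmp16 and %tmp10. The sketch below is not LLVM code and is not taken from r113651; it is a small Python model (all function names are made up for illustration) that checks the identity A & (A & B) == A & B over all i1 inputs.

```python
from itertools import product


def and_i1(a: bool, b: bool) -> bool:
    """Illustrative model of LLVM's `and i1` on two boolean values."""
    return a and b


def tmp14_original(tmp2: bool, tmp4: bool) -> bool:
    """Mirrors the quoted IR: %tmp5 = and %tmp2, %tmp4; %tmp14 = and %tmp2, %tmp5."""
    tmp5 = and_i1(tmp2, tmp4)
    return and_i1(tmp2, tmp5)


def tmp14_simplified(tmp2: bool, tmp4: bool) -> bool:
    """What a simplifier could emit instead: reuse %tmp5 directly."""
    return and_i1(tmp2, tmp4)


def identity_holds() -> bool:
    """Exhaustively compare both forms over the four possible i1 input pairs."""
    return all(
        tmp14_original(a, b) == tmp14_simplified(a, b)
        for a, b in product((False, True), repeat=2)
    )


if __name__ == "__main__":
    print("A & (A & B) == A & B for all i1 inputs:", identity_holds())
```

Since an i1 pair has only four possible values, the exhaustive check is a complete proof of the identity for this type, so every use of %tmp14 can be replaced by %tmp5.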

Thanks for bringing this up. :)


