[llvm-commits] [llvm] r122704 - in /llvm/trunk: lib/Transforms/Scalar/LoopIdiomRecognize.cpp test/Transforms/LoopIdiom/basic.ll

Chris Lattner clattner at apple.com
Mon Jan 3 15:37:24 PST 2011


On Jan 3, 2011, at 1:24 AM, Duncan Sands wrote:

> Hi Chris,
> 
>> enhance loop idiom recognition to scan *all* unconditionally executed
>> blocks in a loop, instead of just the header block.  This makes it more
>> aggressive, able to handle Duncan's Ada examples.
> 
> thanks for doing this!  I noticed two issues with the examples I sent you,
> which now compile to

I didn't keep the .ll files, otherwise I'd answer these questions myself:

> define void @ubytezero([256 x i32]* %a) nounwind {
> return:
>   %tmp32 = getelementptr [256 x i32]* %a, i32 0, i32 0
>   store i32 0, i32* %tmp32, align 4
>   %scevgep = getelementptr [256 x i32]* %a, i32 0, i32 1
>   %scevgep4 = bitcast i32* %scevgep to i8*
>   call void @llvm.memset.p0i8.i32(i8* %scevgep4, i8 0, i32 1020, i32 4, i1 false)
>   ret void
> }
> 
> define void @uintzero(i32* %a) nounwind {
> return:
>   store i32 0, i32* %a, align 4
>   %scevgep = getelementptr i32* %a, i32 1
>   %scevgep3 = bitcast i32* %scevgep to i8*
>   call void @llvm.memset.p0i8.i32(i8* %scevgep3, i8 0, i32 -4, i32 4, i1 false)
>   ret void
> }
> 
> In both functions the memset could also take care of the store to the first
> element, rather than starting from the second element.  However maybe merging
> the store and the memset should be a job for a different pass.

It's likely that this should be done by memcpyopt, which already has logic for merging multiple consecutive stores into a memset.
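
To make the requested merge concrete, here is a minimal C++ sketch of the two shapes for the @ubytezero case. It is not code from memcpyopt or from the commit; the function names are hypothetical and the array is modelled as 256 32-bit elements, matching the [256 x i32] in the IR above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Shape of the current output: one scalar store to element 0, then a
// memset over elements 1..255 (255 * 4 = 1020 bytes).
void zeroStoreThenMemset(std::uint32_t (&A)[256]) {
  A[0] = 0;
  std::memset(&A[1], 0, 255 * sizeof(std::uint32_t));
}

// Merged shape: a single memset over all 256 elements (1024 bytes).
void zeroSingleMemset(std::uint32_t (&A)[256]) {
  std::memset(&A[0], 0, sizeof(A));
}

// Hypothetical helper: true when both shapes leave identical bytes behind.
bool bothFormsAgree() {
  std::uint32_t X[256], Y[256];
  std::memset(X, 0xAB, sizeof(X)); // start from non-zero garbage
  std::memset(Y, 0xAB, sizeof(Y));
  zeroStoreThenMemset(X);
  zeroSingleMemset(Y);
  return std::memcmp(X, Y, sizeof(X)) == 0;
}

// Sanity check that the two shapes are interchangeable for this array.
void selfCheck() { assert(bothFormsAgree()); }
```

Both shapes write the same 1024 bytes, so folding the leading store into the memset only changes the start pointer and the length.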

> Secondly, notice that the size of the memset in the second function is -4.
> Hopefully this will work correctly (i.e. memset 2^32-4 values)!  It did make
> me wonder what happens if the loop stores say 2^32 or 2^48 values.  Presumably
> the type of the memset size argument is automagically set to a size that can
> hold the loop trip count...

I don't understand what you're saying here... is there a miscompilation, or was the original code doing this large store?  As an unsigned i32 length, -4 is 2^32-4 (4294967292), i.e. "a big 32-bit number", and it is zero-extended to i64 on 64-bit targets.
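
A small C++ sketch of that arithmetic, in case it helps. The function names are made up for illustration, and sign extension is shown only for contrast; it is not what happens to the length.

```cpp
#include <cassert>
#include <cstdint>

// The i32 length operand as memset interprets it: an unsigned byte count.
constexpr std::uint32_t asUnsigned32(std::int32_t Len) {
  return static_cast<std::uint32_t>(Len);
}

// Zero extension to 64 bits: the upper 32 bits are filled with zeros.
constexpr std::uint64_t zeroExtendTo64(std::int32_t Len) {
  return static_cast<std::uint64_t>(asUnsigned32(Len));
}

// Sign extension, for contrast only: the upper 32 bits copy the sign bit.
constexpr std::uint64_t signExtendTo64(std::int32_t Len) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(Len));
}

// Sanity checks for the -4 length printed in @uintzero.
void checkLengths() {
  assert(asUnsigned32(-4) == 4294967292u);               // 2^32 - 4
  assert(zeroExtendTo64(-4) == 0x00000000FFFFFFFCull);   // still 2^32 - 4
  assert(signExtendTo64(-4) == 0xFFFFFFFFFFFFFFFCull);   // 2^64 - 4
}
```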

-Chris




