[llvm-commits] [llvm] r122704 - in /llvm/trunk: lib/Transforms/Scalar/LoopIdiomRecognize.cpp test/Transforms/LoopIdiom/basic.ll
Chris Lattner
clattner at apple.com
Mon Jan 3 15:37:24 PST 2011
On Jan 3, 2011, at 1:24 AM, Duncan Sands wrote:
> Hi Chris,
>
>> enhance loop idiom recognition to scan *all* unconditionally executed
>> blocks in a loop, instead of just the header block. This makes it more
>> aggressive, able to handle Duncan's Ada examples.
>
> thanks for doing this! I noticed two issues with the examples I sent you,
> which now compile to
I didn't keep the .ll files, otherwise I'd answer these questions myself:
> define void @ubytezero([256 x i32]* %a) nounwind {
> return:
> %tmp32 = getelementptr [256 x i32]* %a, i32 0, i32 0
> store i32 0, i32* %tmp32, align 4
> %scevgep = getelementptr [256 x i32]* %a, i32 0, i32 1
> %scevgep4 = bitcast i32* %scevgep to i8*
> call void @llvm.memset.p0i8.i32(i8* %scevgep4, i8 0, i32 1020, i32 4, i1 false)
> ret void
> }
>
> define void @uintzero(i32* %a) nounwind {
> return:
> store i32 0, i32* %a, align 4
> %scevgep = getelementptr i32* %a, i32 1
> %scevgep3 = bitcast i32* %scevgep to i8*
> call void @llvm.memset.p0i8.i32(i8* %scevgep3, i8 0, i32 -4, i32 4, i1 false)
> ret void
> }
>
> In both functions the memset could also take care of the store to the first
> element, rather than starting from the second element. However maybe merging
> the store and the memset should be a job for a different pass.
This should probably be done by memcpyopt, which already has logic for merging multiple consecutive stores into a memset.
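
To make the suggested merge concrete for @ubytezero: a 4-byte zero store at offset 0 followed by a 1020-byte zero memset at offset 4 writes the same bytes as a single 1024-byte memset at offset 0. The Python sketch below only models the memory effects on a byte buffer. The helper names are hypothetical, and this is not how memcpyopt is implemented.

```python
# Model of the memory effects only; all helper names are hypothetical.
ARRAY_BYTES = 256 * 4  # [256 x i32] occupies 1024 bytes


def store_i32_zero(buf: bytearray, offset: int) -> None:
    """Model 'store i32 0' at the given byte offset."""
    buf[offset:offset + 4] = bytes(4)


def memset(buf: bytearray, offset: int, value: int, length: int) -> None:
    """Model a memset of 'length' bytes starting at 'offset'."""
    buf[offset:offset + length] = bytes([value]) * length


def store_then_memset(buf: bytearray) -> None:
    """What the quoted IR does: store element 0, memset the other 1020 bytes."""
    store_i32_zero(buf, 0)
    memset(buf, 4, 0, ARRAY_BYTES - 4)


def merged_memset(buf: bytearray) -> None:
    """What a merged form would do: one memset over all 1024 bytes."""
    memset(buf, 0, 0, ARRAY_BYTES)


a = bytearray(b"\xff" * ARRAY_BYTES)
b = bytearray(b"\xff" * ARRAY_BYTES)
store_then_memset(a)
merged_memset(b)
print(a == b)
```

Both buffers end up all zero, which is why folding the leading store into the memset is a pure cleanup.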
> Secondly, notice that the size of the memset in the second function is -4.
> Hopefully this will work correctly (i.e. memset 2^32-4 bytes)! It did make
> me wonder what happens if the loop stores say 2^32 or 2^48 values. Presumably
> the type of the memset size argument is automagically set to a size that can
> hold the loop trip count...
I don't understand what you're saying here... is there a miscompilation, or was the original code doing this large store? -4 is "a big 32-bit number" (2^32-4 when read as unsigned) and is zero extended to i64 on 64-bit targets.
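
For reference, the arithmetic behind that remark can be checked with a small Python sketch. The helper names are made up for illustration. It shows the bit pattern of an i32 holding -4, and what zero extension (as opposed to sign extension) to 64 bits would yield.

```python
# Hypothetical helpers modelling fixed-width integers with Python ints.
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit value as its unsigned bit pattern."""
    return value & MASK32


def zext32_to_64(value: int) -> int:
    """Zero-extend: keep the low 32 bits, upper 32 bits are zero."""
    return as_unsigned32(value)


def sext32_to_64(value: int) -> int:
    """Sign-extend: copy bit 31 into the upper 32 bits."""
    pattern = as_unsigned32(value)
    if pattern & 0x80000000:
        pattern |= MASK64 & ~MASK32
    return pattern


print(as_unsigned32(-4))      # 4294967292, i.e. 2**32 - 4
print(zext32_to_64(-4))       # 4294967292: still 2**32 - 4 as an i64
print(hex(sext32_to_64(-4)))  # 0xfffffffffffffffc, i.e. 2**64 - 4
```

With zero extension the length stays 2^32-4 bytes on a 64-bit target; sign extension would instead turn it into 2^64-4.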
-Chris