[LLVMdev] RFC: PerfGuide for frontend authors

Björn Steinbrink bsteinbr at gmail.com
Sun Mar 1 07:53:09 PST 2015


On 2015.02.28 18:17:27 -0800, Philip Reames wrote:
> > On Feb 28, 2015, at 3:01 PM, Björn Steinbrink <bsteinbr at gmail.com> wrote:
> > 2015-02-28 23:50 GMT+01:00 Philip Reames <listmail at philipreames.com>:
> >>>> On Feb 28, 2015, at 2:30 PM, Björn Steinbrink <bsteinbr at gmail.com> wrote:
> >>> I should have clarified that that was a reduced, incomplete example; the
> >>> actual code looks like this (after optimizations):
> >>> 
> >>> define void @_ZN9test_func20hdd8a534ccbedd903paaE(i1 zeroext) unnamed_addr #0 {
> >>> entry-block:
> >>>   %x = alloca [100000 x i32], align 4
> >>>   %1 = bitcast [100000 x i32]* %x to i8*
> >>>   %arg = alloca [100000 x i32], align 4
> >>>   call void @llvm.lifetime.start(i64 400000, i8* %1)
> >>>   call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 400000, i32 4, i1 false)
> >>>   %2 = bitcast [100000 x i32]* %arg to i8*
> >>>   call void @llvm.lifetime.start(i64 400000, i8* %2) ; this happens too late
> >>>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %1, i64 400000, i32 4, i1 false)
> >>>   call void asm "", "r,~{dirflag},~{fpsr},~{flags}"([100000 x i32]* %arg) #2, !noalias !0, !srcloc !3
> >>>   call void @llvm.lifetime.end(i64 400000, i8* %2) #2, !alias.scope !4, !noalias !0
> >>>   call void @llvm.lifetime.end(i64 400000, i8* %2)
> >>>   call void @llvm.lifetime.end(i64 400000, i8* %1)
> >>>   ret void
> >>> }
> >>> 
> >>> If the lifetime start for %arg is moved up, before the memset, the
> >>> callslot optimization can take place and the %x alloca is eliminated,
> >>> but with the lifetime starting after the memset, that isn't possible.
> >> This bit of IR actually seems pretty reasonable given the inline asm. The only thing I really see is that the memcpy could be a memset. Are you expecting something else?
> > 
> > The only thing to improve is that the memset should write directly
> > to %arg, and %x should then be removed because it is dead. This
> > happens when there are no lifetime intrinsics or when the call to
> > lifetime.start is moved before the call to memset. The latter is what
> > my first mail was about: it is usually better to have overlapping
> > lifetimes all start at the same point, instead of starting them as
> > late as possible.
> Honestly, this sounds like a clear optimizer bug, not something a frontend should work around.
> 
> Can you file a bug with the four sets of IR? (Both schedules, no intrinsics before and after). This should hopefully be easy to fix.

I went ahead and made a fix: http://reviews.llvm.org/D7984
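
For reference, here is a rough sketch of the other schedule, where the
lifetime for %arg starts before the memset. It is hand-reduced from the
IR quoted above, not actual compiler output: the function name is
shortened (a made-up name), metadata and attribute groups are dropped,
the duplicated lifetime.end is folded into one, and the declarations are
added only so that the snippet stands alone.

define void @test_func(i1 zeroext) unnamed_addr {
entry-block:
  %x = alloca [100000 x i32], align 4
  %arg = alloca [100000 x i32], align 4
  %1 = bitcast [100000 x i32]* %x to i8*
  %2 = bitcast [100000 x i32]* %arg to i8*
  call void @llvm.lifetime.start(i64 400000, i8* %1)
  call void @llvm.lifetime.start(i64 400000, i8* %2) ; now before the memset
  call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 400000, i32 4, i1 false)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %1, i64 400000, i32 4, i1 false)
  call void asm "", "r,~{dirflag},~{fpsr},~{flags}"([100000 x i32]* %arg)
  call void @llvm.lifetime.end(i64 400000, i8* %2)
  call void @llvm.lifetime.end(i64 400000, i8* %1)
  ret void
}

declare void @llvm.lifetime.start(i64, i8* nocapture)
declare void @llvm.lifetime.end(i64, i8* nocapture)
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1)

With this ordering nothing sits between the memset and the memcpy, so
the memset can be made to write to %arg directly and %x becomes dead,
as described in the quoted text above.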

> Do you know of other cases like this with the lifetime intrinsics?

Not offhand, but I didn't check the IR closely for such issues. The
memcpy thing was found by chance, too.

Björn



