[PATCH] Add basic support for removal of loads that are fed by a store of an aggregate

Björn Steinbrink bsteinbr at gmail.com
Fri Jan 2 15:40:39 PST 2015


On 2015.01.02 23:29:00 +0100, Björn Steinbrink wrote:
> [Added llvm-commit back to Cc, sorry, removing it wasn't intentional]
> 
> On 2015.01.02 14:05:28 -0800, Chandler Carruth wrote:
> > On Fri, Jan 2, 2015 at 2:02 PM, Björn Steinbrink <bsteinbr at gmail.com> wrote:
> > 
> > > 2015-01-02 22:36 GMT+01:00 Chandler Carruth <chandlerc at google.com>:
> > > >
> > > > On Fri, Jan 2, 2015 at 1:27 PM, Björn Steinbrink <bsteinbr at gmail.com> wrote:
> > > >>
> > > >> If we have a simple load through a GEP that is fed by a store of an
> > > >> aggregate, we can use the GEP indices to walk the stored aggregate and
> > > >> extract the appropriate value to replace the load.
> > > >
> > > >
> > > > What's the motivation for this change?
> > >
> > > The rust compiler hit a case where not having this optimization caused
> > > a branch not to be removed. The corresponding issue for rustc is
> > > https://github.com/rust-lang/rust/issues/20149
> > >
> > > > Note that SROA replaces all stores of aggregates with scalar stores of
> > > > the components specifically so that neither it nor GVN needs to
> > > > cope with aggregate loads or stores.
> > >
> > > SROA did the opposite thing in this case, replacing individual stores
> > > with insertvalues and a store of an FCA. I'm attaching the full output of
> > > opt -print-after-all -O2 -S for the failing test case given in the
> > > rust issue. The first SROA creates the FCA store, and later when the
> > > call to `unwrap` is inlined the existing optimizations can't eliminate
> > > the branch that comes with it, because of the FCA store.
> > 
> > 
> > SROA doesn't create FCA stores in any cases I'm aware of... Can you provide
> > a somewhat reduced test case rather than a massive log?
> 
> I got myself confused because it introduced the insertvalue
> instructions. The FCA store was already present and not constructed by
> SROA. I've attached a small test case this time.

And... I forgot it again :-/ I'm really sorry, apparently this is not my
day.

Björn
-------------- next part --------------
target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @foo();
declare void @bar();

declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) unnamed_addr #1

; Function Attrs: nounwind readnone uwtable
define void @test({ i8 , i8 }*) {
entry-block:
  %v = alloca { i8, i8 }
  %v.1 = getelementptr inbounds { i8, i8 }* %v, i64 0, i32 0
  store i8 1, i8 *%v.1
  %v.2 = getelementptr inbounds { i8, i8 }* %v, i64 0, i32 1
  store i8 2, i8 *%v.2
  %src = bitcast { i8, i8 } *%v to i8*
  %dst = bitcast { i8, i8 } *%0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 2, i32 1, i1 false)
  call void @brancher({ i8, i8 }* %0)
  ret void
}

define internal void @brancher({ i8, i8 }*) {
  %v.1 = getelementptr inbounds { i8, i8 }* %0, i64 0, i32 0
  %v1 = load i8* %v.1
  %c = icmp eq i8 %v1, 1
  br i1 %c, label %if, label %else
if:
  call void @foo()
  br label %out

else:
  call void @bar()
  br label %out
out:
  ret void
}
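
; Illustrative sketch only, not taken from the patch: the pattern described
; at the top of the thread, a simple load through a GEP that is fed by a
; store of a first-class aggregate (FCA). The function and value names
; below are hypothetical.
define i8 @fca_load_before({ i8, i8 }* %p, { i8, i8 } %agg) {
  store { i8, i8 } %agg, { i8, i8 }* %p
  %f = getelementptr inbounds { i8, i8 }* %p, i64 0, i32 1
  %v = load i8* %f
  ret i8 %v
}

; Assumed result of the optimization: the constant GEP indices after the
; leading 0 (here just 1) are used to walk the stored aggregate, so the
; load becomes an extractvalue. The store stays because %p may be read by
; the caller.
define i8 @fca_load_after({ i8, i8 }* %p, { i8, i8 } %agg) {
  store { i8, i8 } %agg, { i8, i8 }* %p
  %v = extractvalue { i8, i8 } %agg, 1
  ret i8 %v
}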

