[llvm-dev] Global removal pass - potential for improvement?

Tue Jan 28 10:42:33 PST 2020

I'm thinking about this a bit more, and I don't know if it's something that
can be used in a generic backend like LLVM. Should accessing a smaller
object from a bigger pointer be considered UB?
In contrast, sing the assumption that bigger types can not be dereferenced
from smaller types is a good idea. If this is already added to BasicAA, I
wonder why it does not optimize this out?

On Tue, Jan 28, 2020 at 1:23 PM Doerfert, Johannes <jdoerfert at anl.gov>
wrote:

> On 01/28, Karl Rehm wrote:
> > >
> > > I need to take a closer look but I would have expected BasicAA to be
> > > able to determine that `do_log` and `R` cannot alias. In the -Os
> version
> > > (lower right here https://gcc.godbolt.org/z/KLaeiH), the write to `R`
> > > clobbers the read from `do_log` which prevents us from removing the
> > > load/store pair. My reasoning would have been that we know the size of
> > > `do_log` to be less than the size accessed via `R`. What exactly goes
> > > wrong or if my logic is flawed needs to be examined. I would start
> > > looking at the debug generated by the code parts touched here:
> > > https://reviews.llvm.org/D66157
> > >
> >
> > Well, if I recall correctly there's actually a problem with AA being
> *too*
> > aggressive for this reason: https://gcc.godbolt.org/z/CXN8Gw
> > There's a bug report about this somewhere, don't remember where though.
>
> That seems to be related purely to TBAA: https://gcc.godbolt.org/z/ZLSzHH
>
> What I described should fire if you see the allocation. Alternatively,
> we can add a new "max object size" attribute [0] which gives a size
> upper bound for the pointed to memory. In either case, if you know the
> object pointed to by A is maximal N bytes of size, and the object
> pointed to by B is at most M bytes of size, and N < M, A and B do not
> alias. As of now, we use `dereferenceable` to determine the minimum size
> of the underlying object* and require to see the definition (I think) to
> obtain a maximum size.
>
> * It's actually a lower bound on the minimum size based on the pointer
>   but that is conservatively fine too.
>
> [0] https://www.youtube.com/watch?v=HVvvCSSLiTw
>
>
> > On Tue, Jan 28, 2020 at 12:44 PM Doerfert, Johannes <jdoerfert at anl.gov>
> > wrote:
> >
> > > Hi Karl, Roman,
> > >
> > > On 01/28, Roman Lebedev via llvm-dev wrote:
> > > > On Tue, Jan 28, 2020 at 8:09 PM Karl Rehm via llvm-dev
> > > > <llvm-dev at lists.llvm.org> wrote:
> > > > > I was looking into how the global optimization pass fares against
> > > > > things like what's reported in
> > > > > https://bugs.llvm.org/show_bug.cgi?id=44676
> > >
> > > I need to take a closer look but I would have expected BasicAA to be
> > > able to determine that `do_log` and `R` cannot alias. In the -Os
> version
> > > (lower right here https://gcc.godbolt.org/z/KLaeiH), the write to `R`
> > > clobbers the read from `do_log` which prevents us from removing the
> > > load/store pair. My reasoning would have been that we know the size of
> > > `do_log` to be less than the size accessed via `R`. What exactly goes
> > > wrong or if my logic is flawed needs to be examined. I would start
> > > looking at the debug generated by the code parts touched here:
> > > https://reviews.llvm.org/D66157
> > >
> > >
> > > > > Looking at this, I think it would be pretty trivial to optimize
> that
> > > > > down given that there are already threading assumptions made:
> > > > > https://godbolt.org/z/u6ZqoB
> > >
> > > Optimizing more aggressively based on forward process guarantees will
> > > get us in more trouble than we are already in. I don't have the link
> > > handy but as far as I remember the proposed solution was to have a
> > > forward process guarantee function attribute. I would recommend we look
> > > into that first before we start more aggressive optimizations which
> will
> > > cause problems for a lot of (non C/C++) folks.
> > >
> > >
> > > > > Is this something I can look into?
> > >
> > > Sure :)
> > >
> > >
> > > > > Another thing is that currently *all* external calls break this
> > > > > optimization, including calls to intrinsics that probably
> shouldn't:
> > > > > https://godbolt.org/z/pK7Cew
> > > > I think during load propagation, there is a legality check "here's a
> > > > load, and here's a store.
> > > > Is there anything in between that may have clobbered that memory
> > > location?".
> > >
> > > Right now we only have `__attribute__((pure/const))` but we want to
> > > expose all LLVM-IR attributes to the user soon [0] which will allow way
> > > more fine-grained control. Intrinsics are a different story again.
> > >
> > >
> > > > For calls, there are some attributes that are helpful here:
> > > > https://llvm.org/docs/LangRef.html#function-attributes
> > > > So in this case, i guess `@llvm.x86.flags.write` intrinsic maybe can
> > > > be annotated with readonly attribute,
> > > > thus signalling that it won't clobber that memory location?
> > >
> > > While target specific intrinsics are a bit more complicated we see the
> > > problem often with generic intrinsic already. We proposed the other day
> > > [1] to change the default semantics of non-target specific intrinsics
> > > such that you have to opt-in for certain effects.
> > >
> > > For the above example you want `llvm.x86.flags.write` to be
> `writeonly` and
> > > `inaccesiblememonly`. Also `nosync`, `willreturn`, ...
> > >
> > > Cheers,
> > >   Johannes
> > >
> > >
> > > [0] https://www.youtube.com/watch?v=elmio6AoyK0
> > > [1] http://lists.llvm.org/pipermail/llvm-dev/2019-August/134404.html
> > >
>
> --
>
> Johannes Doerfert
> Researcher
>
> Argonne National Laboratory
> Lemont, IL 60439, USA
>
> jdoerfert at anl.gov
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200128/b5f496b4/attachment.html>