[llvm-dev] Global removal pass - potential for improvement?
Doerfert, Johannes via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 28 10:23:18 PST 2020
On 01/28, Karl Rehm wrote:
> >
> > I need to take a closer look but I would have expected BasicAA to be
> > able to determine that `do_log` and `R` cannot alias. In the -Os version
> > (lower right here https://gcc.godbolt.org/z/KLaeiH), the write to `R`
> > clobbers the read from `do_log` which prevents us from removing the
> > load/store pair. My reasoning would have been that we know the size of
> > `do_log` to be less than the size accessed via `R`. What exactly goes
> > wrong or if my logic is flawed needs to be examined. I would start
> > looking at the debug generated by the code parts touched here:
> > https://reviews.llvm.org/D66157
> >
>
> Well, if I recall correctly there's actually a problem with AA being *too*
> aggressive for this reason: https://gcc.godbolt.org/z/CXN8Gw
> There's a bug report about this somewhere, don't remember where though.
That seems to be related purely to TBAA: https://gcc.godbolt.org/z/ZLSzHH
What I described should fire if you see the allocation. Alternatively,
we can add a new "max object size" attribute [0] which gives a size
upper bound for the pointed to memory. In either case, if you know the
object pointed to by A is maximal N bytes of size, and the object
pointed to by B is at most M bytes of size, and N < M, A and B do not
alias. As of now, we use `dereferenceable` to determine the minimum size
of the underlying object* and require to see the definition (I think) to
obtain a maximum size.
* It's actually a lower bound on the minimum size based on the pointer
but that is conservatively fine too.
[0] https://www.youtube.com/watch?v=HVvvCSSLiTw
> On Tue, Jan 28, 2020 at 12:44 PM Doerfert, Johannes <jdoerfert at anl.gov>
> wrote:
>
> > Hi Karl, Roman,
> >
> > On 01/28, Roman Lebedev via llvm-dev wrote:
> > > On Tue, Jan 28, 2020 at 8:09 PM Karl Rehm via llvm-dev
> > > <llvm-dev at lists.llvm.org> wrote:
> > > > I was looking into how the global optimization pass fares against
> > > > things like what's reported in
> > > > https://bugs.llvm.org/show_bug.cgi?id=44676
> >
> > I need to take a closer look but I would have expected BasicAA to be
> > able to determine that `do_log` and `R` cannot alias. In the -Os version
> > (lower right here https://gcc.godbolt.org/z/KLaeiH), the write to `R`
> > clobbers the read from `do_log` which prevents us from removing the
> > load/store pair. My reasoning would have been that we know the size of
> > `do_log` to be less than the size accessed via `R`. What exactly goes
> > wrong or if my logic is flawed needs to be examined. I would start
> > looking at the debug generated by the code parts touched here:
> > https://reviews.llvm.org/D66157
> >
> >
> > > > Looking at this, I think it would be pretty trivial to optimize that
> > > > down given that there are already threading assumptions made:
> > > > https://godbolt.org/z/u6ZqoB
> >
> > Optimizing more aggressively based on forward process guarantees will
> > get us in more trouble than we are already in. I don't have the link
> > handy but as far as I remember the proposed solution was to have a
> > forward process guarantee function attribute. I would recommend we look
> > into that first before we start more aggressive optimizations which will
> > cause problems for a lot of (non C/C++) folks.
> >
> >
> > > > Is this something I can look into?
> >
> > Sure :)
> >
> >
> > > > Another thing is that currently *all* external calls break this
> > > > optimization, including calls to intrinsics that probably shouldn't:
> > > > https://godbolt.org/z/pK7Cew
> > > I think during load propagation, there is a legality check "here's a
> > > load, and here's a store.
> > > Is there anything in between that may have clobbered that memory
> > location?".
> >
> > Right now we only have `__attribute__((pure/const))` but we want to
> > expose all LLVM-IR attributes to the user soon [0] which will allow way
> > more fine-grained control. Intrinsics are a different story again.
> >
> >
> > > For calls, there are some attributes that are helpful here:
> > > https://llvm.org/docs/LangRef.html#function-attributes
> > > So in this case, i guess `@llvm.x86.flags.write` intrinsic maybe can
> > > be annotated with readonly attribute,
> > > thus signalling that it won't clobber that memory location?
> >
> > While target specific intrinsics are a bit more complicated we see the
> > problem often with generic intrinsic already. We proposed the other day
> > [1] to change the default semantics of non-target specific intrinsics
> > such that you have to opt-in for certain effects.
> >
> > For the above example you want `llvm.x86.flags.write` to be `writeonly` and
> > `inaccesiblememonly`. Also `nosync`, `willreturn`, ...
> >
> > Cheers,
> > Johannes
> >
> >
> > [0] https://www.youtube.com/watch?v=elmio6AoyK0
> > [1] http://lists.llvm.org/pipermail/llvm-dev/2019-August/134404.html
> >
--
Johannes Doerfert
Researcher
Argonne National Laboratory
Lemont, IL 60439, USA
jdoerfert at anl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200128/bf0be1e5/attachment.sig>
More information about the llvm-dev
mailing list