<div dir="ltr">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I need to take a closer look but I would have expected BasicAA to be<br>

able to determine that `do_log` and `R` cannot alias. In the -Os version<br>

(lower right here <a href="https://gcc.godbolt.org/z/KLaeiH" rel="noreferrer" target="_blank">https://gcc.godbolt.org/z/KLaeiH</a>), the write to `R`<br>

clobbers the read from `do_log` which prevents us from removing the<br>

load/store pair. My reasoning would have been that we know the size of<br>

`do_log` to be less than the size accessed via `R`. What exactly goes<br>

wrong or if my logic is flawed needs to be examined. I would start<br>

looking at the debug generated by the code parts touched here:<br>

<div><a href="https://reviews.llvm.org/D66157" rel="noreferrer" target="_blank">https://reviews.llvm.org/D66157</a></div></blockquote><div><br></div><div>Well, if I recall correctly there's actually a problem with AA being *too* aggressive for this reason: <a href="https://gcc.godbolt.org/z/CXN8Gw">https://gcc.godbolt.org/z/CXN8Gw</a></div><div>There's a bug report about this somewhere, don't remember where though.<br></div><div><br></div>


<br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 28, 2020 at 12:44 PM Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov" target="_blank">jdoerfert@anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Karl, Roman,<br>

<br>

On 01/28, Roman Lebedev via llvm-dev wrote:<br>

> On Tue, Jan 28, 2020 at 8:09 PM Karl Rehm via llvm-dev<br>

> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>

> > I was looking into how the global optimization pass fares against<br>

> > things like what's reported in<br>

> > <a href="https://bugs.llvm.org/show_bug.cgi?id=44676" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=44676</a><br>

<br>

I need to take a closer look but I would have expected BasicAA to be<br>

able to determine that `do_log` and `R` cannot alias. In the -Os version<br>

(lower right here <a href="https://gcc.godbolt.org/z/KLaeiH" rel="noreferrer" target="_blank">https://gcc.godbolt.org/z/KLaeiH</a>), the write to `R`<br>

clobbers the read from `do_log` which prevents us from removing the<br>

load/store pair. My reasoning would have been that we know the size of<br>

`do_log` to be less than the size accessed via `R`. What exactly goes<br>

wrong or if my logic is flawed needs to be examined. I would start<br>

looking at the debug generated by the code parts touched here:<br>

<a href="https://reviews.llvm.org/D66157" rel="noreferrer" target="_blank">https://reviews.llvm.org/D66157</a><br>

<br>

<br>

> > Looking at this, I think it would be pretty trivial to optimize that<br>

> > down given that there are already threading assumptions made:<br>

> > <a href="https://godbolt.org/z/u6ZqoB" rel="noreferrer" target="_blank">https://godbolt.org/z/u6ZqoB</a><br>

<br>

Optimizing more aggressively based on forward process guarantees will<br>

get us in more trouble than we are already in. I don't have the link<br>

handy but as far as I remember the proposed solution was to have a<br>

forward process guarantee function attribute. I would recommend we look<br>

into that first before we start more aggressive optimizations which will<br>

cause problems for a lot of (non C/C++) folks.<br>

<br>

<br>

> > Is this something I can look into?<br>

<br>

Sure :)<br>

<br>

<br>

> > Another thing is that currently *all* external calls break this<br>

> > optimization, including calls to intrinsics that probably shouldn't:<br>

> > <a href="https://godbolt.org/z/pK7Cew" rel="noreferrer" target="_blank">https://godbolt.org/z/pK7Cew</a><br>

> I think during load propagation, there is a legality check "here's a<br>

> load, and here's a store.<br>

> Is there anything in between that may have clobbered that memory location?".<br>

<br>

Right now we only have `__attribute__((pure/const))` but we want to<br>

expose all LLVM-IR attributes to the user soon [0] which will allow way<br>

more fine-grained control. Intrinsics are a different story again.<br>

<br>

<br>

> For calls, there are some attributes that are helpful here:<br>

> <a href="https://llvm.org/docs/LangRef.html#function-attributes" rel="noreferrer" target="_blank">https://llvm.org/docs/LangRef.html#function-attributes</a><br>

> So in this case, i guess `@llvm.x86.flags.write` intrinsic maybe can<br>

> be annotated with readonly attribute,<br>

> thus signalling that it won't clobber that memory location?<br>

<br>

While target specific intrinsics are a bit more complicated we see the<br>

problem often with generic intrinsic already. We proposed the other day<br>

[1] to change the default semantics of non-target specific intrinsics<br>

such that you have to opt-in for certain effects.<br>

<br>

For the above example you want `llvm.x86.flags.write` to be `writeonly` and<br>

`inaccesiblememonly`. Also `nosync`, `willreturn`, ...<br>

<br>

Cheers,<br>

  Johannes<br>

<br>

<br>

[0] <a href="https://www.youtube.com/watch?v=elmio6AoyK0" rel="noreferrer" target="_blank">https://www.youtube.com/watch?v=elmio6AoyK0</a><br>

[1] <a href="http://lists.llvm.org/pipermail/llvm-dev/2019-August/134404.html" rel="noreferrer" target="_blank">http://lists.llvm.org/pipermail/llvm-dev/2019-August/134404.html</a><br>

</blockquote></div></div>