[llvm-dev] Reordering of load/stores using MemorySSA

Mon Oct 1 21:38:15 PDT 2018

On Mon, Oct 1, 2018 at 9:03 PM Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr> wrote:
>
>
> Hello all,
>
> It seems that it is insufficient to
> rely on MemorySSA to correctly reorder load/stores.

It depends on what you are trying to do.

>
> For example, we have this following code:
>
> v = load q
> store 10, p
>
> =>
>
> store 10, p
> v = load q // This is incorrect if p and q may alias
>
> In the MemorySSA representation however,
> there's no syntactic dependency between load/store,
> because MemorySSA before transformation would look like this:
>
> USE(M0) // corresponding to `v = load q`
> M1 = DEF(M0) // corresponding to `store 10, p`

This is correct, MemorySSA, like SSA, represents upwards dependencies
and not downwards ones.

Much like there is SSU and SSI, and SSA/SSU/SSI on the reverse flow
graph, there are analogues for MemorySSA that we do not implement.

>
> Reordering these two nodes does not break syntactic dependency,
This is true in regular SSA as well :)
It just happens that memory is different due to aliasing.

> hence additional flow-sensitive analysis is needed to check whether reordering is valid.

> To my understanding, GVNHoist seems to do that:
> Memory operations in basic blocks are checked by
> GVNHoist::hasMemoryUse().
>
Correct.

>
> If USE nodes of MemorySSA also have left-hand side, we can correctly
> represent this kind of reorderability IMHO.
>
> For the example above:
>
> U1 = USE(M0) // corresponding to `v = load q`
> M1 = DEF(M0, U1) // corresponding to `store 10, p`

It is not this simple.
You also need to avoid multiple reaching uses for a single def.
IE
U0 = Use(M0) // corresponding to `v = load q`
U1 = Use(M1) // corresponding to `t = load r` where r aliases q aliases p
M1 = DEF(M0, ???)

??? must be U0, U1 above.

If you have the analogue of phis, you will require that analogue to
merge the multiple reaching uses, and you may have multiple merges per
block.
IE
U2 = merge(U0, U1)
M1 = DEF (M0, U2)

>
> Now there is a syntactic dependency between M1 and U1, hence
> we can syntactically check whether reordering is allowed without any flow-sensitive analysis.

What's the goal?
Much like SSA is meant to speed up certain dataflow problems, so is
this. Something would have to be slow to make this worth it.

MemorySSA in particular is deliberately an overapproximation in LLVM
(and GCC), because we discovered exactness is super-expensive,
impossible, and going as far as you can buys you almost nothing in
practice.
However, it makes for a much faster memorydependenceanalysis (which in
llvm is also only upwards and not downwards).

Here, it's not clear that GVNHoist is slow enough to make building an
entire ReverseMemorySSA form worth it.
In particular, building reverse MemorySSA will double the number of
alias checks in most cases.
You would almost certainly have to fix the BasicAA caching issues and
finally split BasicAA or something to make it viable.
(MemorySSA already pays a high price for BasicAA being super slow).

> If the store and load don't alias, we can remove U1 from DEF:
>
> U1 = USE(M0) // corresponding to `v = load q`
> M1 = DEF(M0) // corresponding to `store 10, p`
>
> Then, the reordering is valid.
>
> One issue is regarding PHI node: possible solution is to make
> two kinds of PHIS - PHIs for DEFs & PHIS for USEs, so there are
> at most 2 PHIs.
>
>
> Does this idea seem to make sense? Any thoughts?
>
It's certainly correct, i have considered doing it before but never
had something to test out whether it was really worth the cost or not.

One previous blocker was that it requires a correct postdomtree to be
able to do placement of use phis.
That should now be solved.

I suspect you will find the cost is not worth it without a meaningful
motivating optimization (like store PRE or something)