[PATCH] D142705: [GVN] Support address translation through select instructions

Tue Feb 7 02:11:01 PST 2023

kachkov98 marked an inline comment as not done.
kachkov98 added a comment.

I want to add more context for this change, which may answer some questions about the algorithm and its limitations. Consider the following simple case:

    BB1
    / |
  BB2 |
    \ |
    BB3

Here BB3 contains a load from address `P`, which translates to address `P1` along the BB1->BB3 edge and to address `P2` along the BB2->BB3 edge. For simplicity, let's assume there are no clobbers in between. If BB1 has Defs for both `P1` and `P2` (e.g. loads from these pointers), this is a case of full redundancy: we can insert a phi in BB3 (the `P1` Def for BB1->BB3 and the `P2` Def for BB2->BB3). However, the current design of MemoryDependenceAnalysis can't handle it, as explained in this comment <https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp#L1304>.

The reason is the following: MemoryDependenceAnalysis walks backwards from BB3, trying to find the dependency of the load from `P`. Moving from BB3 to BB2, it translates the address to `P2`; moving from BB2 to BB1, the address doesn't change; and in BB1 it finds the Def of `P2`. Then it starts to explore the other paths from BB3 (basically it's a backward DFS) and goes from BB3 to BB1, translating the address to `P1`. Now we have a collision: BB1 was already visited with address `P2`, and since the addresses don't match, the analysis bails out. To handle this case, it would have to propagate multiple addresses back at once: in BB1, the load from `P` may actually be a load from `P1` or from `P2`, so we would have to look for both variants. However, we need to be careful here: the number of such variants can grow exponentially with each basic block where paths carrying different addresses meet.
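To make the collision concrete, here is a minimal toy model of the backward walk. It is only a sketch and not the MemoryDependenceAnalysis code: `ToyCFG`, `walkBack` and `diamondWalkSucceeds` are hypothetical names, blocks and addresses are plain strings, and the predecessor order (BB2 before BB1) is chosen to match the walk described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy CFG (hypothetical, not an LLVM data structure).
struct ToyCFG {
  // block -> predecessors, in the order the walk visits them
  std::map<std::string, std::vector<std::string>> preds;
  // (pred, succ) -> address of the load as seen from pred; edges that are
  // missing here leave the address unchanged
  std::map<std::pair<std::string, std::string>, std::string> translate;
};

// Backward DFS carrying exactly one address per block. Returns false when a
// block is reached a second time with a different address (the bail-out).
bool walkBack(const ToyCFG &cfg, const std::string &block,
              const std::string &addr,
              std::map<std::string, std::string> &visited) {
  auto predIt = cfg.preds.find(block);
  if (predIt == cfg.preds.end())
    return true; // entry block: nothing more to explore
  for (const std::string &pred : predIt->second) {
    auto tr = cfg.translate.find(std::make_pair(pred, block));
    std::string predAddr = (tr == cfg.translate.end()) ? addr : tr->second;
    auto res = visited.insert(std::make_pair(pred, predAddr));
    if (!res.second) {
      if (res.first->second != predAddr)
        return false; // same block, different address: collision
      continue;       // same block, same address: already handled
    }
    if (!walkBack(cfg, pred, predAddr, visited))
      return false;
  }
  return true;
}

// Builds the BB1/BB2/BB3 diamond and walks back from the load in BB3.
bool diamondWalkSucceeds(const std::string &addrViaBB1,
                         const std::string &addrViaBB2) {
  ToyCFG cfg;
  cfg.preds["BB3"] = {"BB2", "BB1"};
  cfg.preds["BB2"] = {"BB1"};
  cfg.translate[std::make_pair(std::string("BB1"), std::string("BB3"))] = addrViaBB1;
  cfg.translate[std::make_pair(std::string("BB2"), std::string("BB3"))] = addrViaBB2;
  std::map<std::string, std::string> visited;
  visited["BB3"] = "P";
  return walkBack(cfg, "BB3", "P", visited);
}
```

With different addresses on the two edges, `diamondWalkSucceeds("P1", "P2")` reaches BB1 first with `P2` and then with `P1`, so the walk gives up. If both edges translate to the same address, the second visit is harmless.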

In practice, an important special case of the previous example is the select instruction: if BB2 is empty, SimplifyCFG turns this control flow into a select. It has the same problem: we need to propagate 2 addresses back at once, and the number of variants can also grow exponentially, e.g.:

  %a = select %c1, %i1, %i2
  %b = select %c2, %i3, %i4
  %sum = add %a, %b
  %P = GEP %ptr, %sum 

(Here we have 4 variants of %P).
So the first design choice is to translate the address through only one select condition, so that there are always exactly 2 addresses to propagate back. The next question is how to choose this condition. In theory, we could go through all select conditions and try to find the "best" one, but that looks like too much complication without any real improvement: in practice a load address depends on at most one select, so this implementation just picks the first select it finds. The current GVN implementation already supports some cases where the load address depends on a select directly (it was refactored in https://reviews.llvm.org/D141619), and this patch tries to extend it to cases where the address depends on a select through other instructions (GEPs and casts). With this approach MemoryDependenceAnalysis still needs to report a Def dependency on a select instruction, and this dependency instruction is used to rematerialize the available value in GVN. We only need its condition and its position to insert the materialized value, so reporting `select i1 %cond, i32 undef, i32 undef` is OK: we don't care about its true and false values, because the addresses for %cond = true and %cond = false are passed separately.
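As an illustration of the trade-off, the sketch below models the address expression from the example as a small tree. It counts all variants and then translates through the first select only. Everything here is hypothetical (`Expr`, `countVariants`, `firstSelectCond`, `render` are not LLVM APIs), and it assumes every select has a distinct condition.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Toy expression node (hypothetical, not llvm::Value).
struct Expr {
  enum Kind { Leaf, Select, Op } kind;
  std::string name;         // leaf name, select condition, or opcode
  std::vector<ExprPtr> ops; // Select: {trueValue, falseValue}
};

ExprPtr makeLeaf(const std::string &n) {
  return std::make_shared<Expr>(Expr{Expr::Leaf, n, {}});
}
ExprPtr makeSelect(const std::string &cond, ExprPtr t, ExprPtr f) {
  return std::make_shared<Expr>(Expr{Expr::Select, cond, {t, f}});
}
ExprPtr makeOp(const std::string &opcode, ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Op, opcode, {a, b}});
}

// Number of distinct addresses if every select is expanded
// (assumes all select conditions are independent).
unsigned countVariants(const ExprPtr &e) {
  if (e->kind == Expr::Leaf)
    return 1;
  if (e->kind == Expr::Select)
    return countVariants(e->ops[0]) + countVariants(e->ops[1]);
  unsigned n = 1;
  for (const ExprPtr &o : e->ops)
    n *= countVariants(o);
  return n;
}

// Condition of the first select found in pre-order, or "" if there is none.
std::string firstSelectCond(const ExprPtr &e) {
  if (e->kind == Expr::Select)
    return e->name;
  for (const ExprPtr &o : e->ops) {
    std::string c = firstSelectCond(o);
    if (!c.empty())
      return c;
  }
  return "";
}

// Prints the expression with selects on `cond` replaced by the chosen side;
// selects on other conditions are left untouched.
std::string render(const ExprPtr &e, const std::string &cond, bool takeTrue) {
  if (e->kind == Expr::Leaf)
    return e->name;
  if (e->kind == Expr::Select && e->name == cond)
    return render(e->ops[takeTrue ? 0 : 1], cond, takeTrue);
  std::string out = (e->kind == Expr::Select) ? "select(" + e->name + ", "
                                              : e->name + "(";
  for (std::size_t i = 0; i < e->ops.size(); ++i) {
    if (i)
      out += ", ";
    out += render(e->ops[i], cond, takeTrue);
  }
  return out + ")";
}

// %P = GEP %ptr, (add (select %c1, %i1, %i2), (select %c2, %i3, %i4))
ExprPtr exampleAddress() {
  ExprPtr a = makeSelect("%c1", makeLeaf("%i1"), makeLeaf("%i2"));
  ExprPtr b = makeSelect("%c2", makeLeaf("%i3"), makeLeaf("%i4"));
  return makeOp("gep", makeLeaf("%ptr"), makeOp("add", a, b));
}
```

For the example address, full expansion yields 4 variants, while fixing only `%c1` yields two addresses that still contain the `%c2` select as an ordinary operand.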

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142705/new/

https://reviews.llvm.org/D142705