[PATCH] D129734: [InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the front

Thu Jul 14 17:15:45 PDT 2022

huangjd added a comment.

In D129734#3650863 <https://reviews.llvm.org/D129734#3650863>, @nikic wrote:

> I think this will do worse in practice, because it breaks GVN. Imagine you have accesses like `ary[i].x`, `ary[i].y`, `ary[i].z` represented with two GEPs. If constants are canonicalized to the end, then the `ary[i]` GEP is the same all three times and can be CSEd/GVNd. If the constants are canonicalized to the start, then these all become distinct GEPs on different bases.

Expressions like ary[i].x will always be merged in visitGEPofGEP into `getelementptr type, ptr ary, i64 i, i32 0` even without my patch. In fact almost all valid C++ array/member access expression will be coalesced into one multi-index GEP, unless you are casting the pointer into another type that it actually isn't, and try to do more pointer arithmetic, which is probably intentional UB anyways.

If that breaks GVN then I think GEP transform pass should be pushed back so that CSE happens first. In the other hand, CSE should actually not apply to accesses in your case, because in most architecture load instruction can take a constant offset, so it wouldn't be necessary to compute the intermediate address of ary[i].

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129734/new/

https://reviews.llvm.org/D129734