[LLVMdev] Missing InstCombine optimization.

Tue Jun 4 15:38:33 PDT 2013

On Jun 4, 2013, at 6:27 AM, "Bader, Aleksey A" <aleksey.a.bader at intel.com> wrote:

> Hi Jakob,
>  
> I’ve a problem related to the commit #155362.
>  
> Consider the following snippet:
> void bar(float* f) {
> …
> }
> void foo(float* f, int idx) {
> int hi = idx>>3;
> int lo = idx&7;
> bar(&f[hi*8+lo]); // hi*8 + lo == idx
> bar(&f[hi*10+lo]);
> }
>  
> Before revision 155362, InstCombine was able to optimize hi*8+lo to idx by applying the following patterns:
> 1. hi*8 -> hi << 3
> 2. ((idx >> 3) << 3) -> idx & -8
> 3. hi*8+lo -> hi*8 | lo
> 4. (idx & -8) | (idx & 7) -> idx & (-8 | 7) -> idx
>  
> After 155362, pattern #2 is deferred to the DAGCombine stage, so InstCombine is unable to apply pattern #4:
>         4*. ((idx >> 3) << 3) | (idx & 7) -> idx // SimplifyOr can’t handle it.
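
For reference, the four rewrite steps quoted above can be checked numerically. The following is only a minimal sketch, not anything taken from InstCombine: the function names are hypothetical, and it assumes unsigned 32-bit arithmetic (the snippet uses int) so that the shifts are well defined in C. With unsigned values, "idx & -8" is written as "idx & ~7u".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if each quoted rewrite step holds for this idx, else 0.
 * Assumption: unsigned arithmetic, to avoid shifting negative ints. */
static int patterns_hold(uint32_t idx)
{
    uint32_t hi = idx >> 3;
    uint32_t lo = idx & 7u;

    /* 1. hi*8 -> hi << 3 */
    if (hi * 8u != (hi << 3)) return 0;
    /* 2. ((idx >> 3) << 3) -> idx & -8 */
    if (((idx >> 3) << 3) != (idx & ~7u)) return 0;
    /* 3. hi*8 + lo -> hi*8 | lo (the two operands share no set bits) */
    if (hi * 8u + lo != ((hi * 8u) | lo)) return 0;
    /* 4. (idx & -8) | (idx & 7) -> idx */
    if (((idx & ~7u) | (idx & 7u)) != idx) return 0;
    return 1;
}

/* Hypothetical helper: asserts the steps on a few edge-case values. */
static void check_sample(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 7u, 8u, 9u, 1234567u, 0x7fffffffu, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(patterns_hold(samples[i]));
}
```

Calling check_sample() from a test driver exercises all four steps; it says nothing about which pass should perform them.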

Actually, your own code illustrates the problem with this transformation. Suppose you were using SCEV to analyze the behavior of the two f[] memory references:

  f[hi*8 + lo]
  f[hi*10 + lo]

If you rewrite 'hi*8 + lo' to ‘idx’, the affine relationship between the two memory references is no longer visible, and SCEV will probably tell you that the offsets are unrelated.
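
To make the relationship concrete, here is a small sketch in plain C. It is arithmetic only and does not model SCEV; the function names are hypothetical, and it assumes idx is non-negative and small enough that hi*10 + lo does not overflow. Written in terms of hi and lo, the two indices differ by exactly 2*hi. Once the first index is folded to idx, that shared hi term no longer appears in it.

```c
#include <assert.h>

/* First access from the snippet: f[hi*8 + lo]. */
static int index_a(int idx)
{
    int hi = idx >> 3;
    int lo = idx & 7;
    return hi * 8 + lo;
}

/* Second access from the snippet: f[hi*10 + lo]. */
static int index_b(int idx)
{
    int hi = idx >> 3;
    int lo = idx & 7;
    return hi * 10 + lo;
}

/* Distance between the two accesses, in elements. */
static int offset_delta(int idx)
{
    return index_b(idx) - index_a(idx);
}

/* Hypothetical check, assuming 0 <= idx and no overflow in hi*10 + lo. */
static void check_delta(int idx)
{
    assert(idx >= 0);
    assert(index_a(idx) == idx);                 /* hi*8 + lo folds to idx */
    assert(offset_delta(idx) == 2 * (idx >> 3)); /* (10 - 8) * hi */
}
```

For example, with idx = 29 we get hi = 3 and lo = 5, so the indices are 29 and 35 and the distance is 6.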

The fundamental problem is that there is no such thing as a canonical form of an expression DAG. Some expression graphs, like this one, can take different forms that each enable different analyses. Which one is correct?

I think that the right approach in this case is to preserve the relationships that were already exposed in the original code. That is why pattern 4* is disabled when the inner expression has multiple uses. It preserves the relationships that are expressed through the ‘hi’ variable.

Thanks,
/jakob