[PATCH] D124750: [MLIR] Add a utility to sort the operands of commutative ops

Mon May 9 18:19:06 PDT 2022

srishti-pm added a comment.

In D124750#3502295 <https://reviews.llvm.org/D124750#3502295>, @mehdi_amini wrote:

> You're telling me "what" while I'm actually more interested in the "why" here?

I'm not sure what your question is, with a "why". Let me think about this a bit. I'll get back to you.

> Same as before: this does not tell me why, can you provide an example where this matters?

Sure. This is a bit lengthy. I'm really sorry for that !

So, lets start with some basic understanding here. Let's say I am writing a `matchAndRewrite()` function, where I take the following INPUT and convert it to the following OUTPUT:

INPUT:
a = div b, c
d = sub e, f
g = add d, a
h = const 0
i = mul h, g

OUTPUT:
i = some_op b, c, e, f

Now, when I'm writing a C++ code to match and rewrite:

If I only sort the `i = mul h, g` op, I get my canonicalized input as follows:

CANONICALIZED INPUT #1:

a = div b, c
d = sub e, f
g = add d, a
h = const 0
i = mul g, h

So, I'm basically sorting `i = mul h, g` to `i = mul g, h` using the utility and then writing the if-else statements to match the CANONICALIZED INPUT #1.
So, a psuedo code will be:

  if mul.operand[0].defOp != add OR mul.operand[1].defOp != const 0
    return failure

  if add.operand[0].defOp != sub
    if add.operand[0].defOp != div OR add.operand[1].defOp != sub
      return failure
    else
      get the values of b, c, e, and f
  else if add.operand[1].defOp == div
    get the values of b, c, e, and f
  else
    return failure

  rewrite<some_op>(mul, b, c, e, f)

But, if I had sorted the producers as well, my canonicalized input would be:
CANONICALIZED INPUT #2:

a = div b, c
d = sub e, f
g = add a, d
h = const 0
i = mul g, h

and thus my code will reduce to:

  if mul.operand[0].defOp !=  add OR mul.operand[1].defOp != const 0
    return failure

  if add.operand[0].defOp != div OR add.operand[1].defOp != sub
    return failure

  get the values of b, c, e, and f

  rewrite<some_op>(mul, b, c, e, f)

So, in essence, we can see that the effort of an end user writing a C++ pattern has reduced if I sort the producers as well. But, one may argue that I could have sorted the `add` op after seeing it and then my if-else statements would reduce. So, the above illustration doesn't explain why we sort the producers.

The real reason for sorting the producers is that, if such a thing is not done, the sorting and this entire utility will be virtually useless. A deterministic sorting of an op requires its producers to be sorted. Our sorting algorithm is based on the breadth-first traversal of backard slices. One the same level of DAG, the traversal looks at operands from the first to the last. That is how the breadth-first traversal is defined here. Now, if this traversal is non-deterministic, then, the whole use of sorting something will collapse. **Maybe, this can be BEST explained with the below example.**

If I have this IR:
d = div b, c
s = sub e, f
x = xor k, l
g = add s, d
h = add d, x
i = add g, h

Then, `i = add g, h` will be sorted to `i = add g, h` (no change).

**But**, when I have the below IR (which is functionally the same as the above IR):
d = div b, c
s = sub e, f
x = xor k, l
g = add d, s
h = add d, x
i = add g, h

Then, `i = add g, h` will be sorted to `i = add h, g`.

So, we have two functionally same IRs being sorted differently. This is clearly not useful. The sorting depends on what the input IR is. So, even after sorting, we still have functionally same IRs that look different. So, the pattern matcher (a human) still has to write an excessive number of if-else statements to match the input. This is exactly what this sorting was supposed to avoid. This is as good as not having done any sorting at all!

Is the motivation clear now?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124750/new/

https://reviews.llvm.org/D124750