[PATCH] D74185: Revert the revert of vectorization commits

Fri Feb 7 15:11:08 PST 2020

george.karpenkov requested review of this revision.
george.karpenkov added a comment.

I did manage to reduce the test case further.

For input IR: https://gist.github.com/cheshire/17067d5ba4781817861c8b21d15c928d

Bad optimized version: https://gist.github.com/cheshire/bf1047b4385bcf82c22a70f5cf1fb5df

Good optimized version: https://gist.github.com/cheshire/8bea1f36ab849f8945bc190b519272a6

The compilation comes from an XLA <https://www.tensorflow.org/xla/operation_semantics> test case, which looks like this:

  HloModule EntryModule

  ENTRY EntryModule {
    %input0 = f64[] parameter(0)
    %sign_227 = f64[] sign(f64[] %input0)
    %multiply_235 = f64[] multiply(f64[] %sign_227, f64[] %sign_227)

    %p4 = f64[2,2] broadcast(%input0), dimensions={}
    %dot_81 = f64[2,2] dot(f64[2,2] %p4, f64[2,2] %p4), lhs_contracting_dims={1}, rhs_contracting_dims={1}
    %br2 = f64[2,2] broadcast(f64[] %multiply_235), dimensions={}
    %reshape_294 = f64[2,2] multiply(f64[2,2] %dot_81, f64[2,2] %br2)

    %broadcast_298 = f64[2,3,2,3] broadcast(f64[2,2] %reshape_294), dimensions={0,2}

    %arg7_8 = f64[3,3] parameter(1)
    %broadcast_300 = f64[2,3,2,3] broadcast(f64[3,3] %arg7_8), dimensions={1,3}
    %multiply_301 = f64[2,3,2,3] multiply(f64[2,3,2,3] %broadcast_298, f64[2,3,2,3] %broadcast_300)

    %reshape_302 = f64[6,6] reshape(f64[2,3,2,3] %multiply_301)
    %zero = f64[] constant(0)
    %zeros = f64[6,6] broadcast(f64[] %zero), dimensions={}

    %diag = pred[6,6] constant({{1,0,0,0,0,0}, {0,1,0,0,0,0}, {0,0,1,0,0,0}, {0,0,0,1,0,0}, {0,0,0,0,1,0}, {0,0,0,0,0,1}})
    ROOT %select_316 = f64[6,6] select(pred[6,6] %diag, f64[6,6] %reshape_302, f64[6,6] %zeros)
  }

Essentially, it performs some element-wise multiplications on random input floats, and then replaces all non-diagonal entries with zeros.
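
For readers who do not read HLO, here is a rough pure-Python sketch of what the module computes. The function name `entry_module` and the list-of-lists representation are my own choices for illustration, not anything from XLA, and NaN inputs are not modeled.

```python
def entry_module(input0, arg7_8):
    """Reference sketch of the HLO module; arg7_8 is a 3x3 list of lists."""
    # sign(x) * sign(x): 1.0 for any non-zero finite x, 0.0 for zero.
    sign = (input0 > 0) - (input0 < 0)
    sign_sq = float(sign * sign)

    # p4 is input0 broadcast to 2x2. The dot contracts dimension 1 of both
    # operands, so dot[i][j] = sum_k p4[i][k] * p4[j][k] = 2 * input0**2.
    p4 = [[input0] * 2 for _ in range(2)]
    dot = [[sum(p4[i][k] * p4[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
    scaled = [[dot[i][j] * sign_sq for j in range(2)] for i in range(2)]

    # Broadcasts with dimensions={0,2} and {1,3}, the multiply, and the
    # reshape to 6x6: element [a,b,c,d] lands at row 3*a + b, column 3*c + d.
    out = [[0.0] * 6 for _ in range(6)]
    for a in range(2):
        for b in range(3):
            for c in range(2):
                for d in range(3):
                    row, col = 3 * a + b, 3 * c + d
                    # select(diag, value, 0): keep only diagonal entries.
                    if row == col:
                        out[row][col] = scaled[a][c] * arg7_8[b][d]
    return out
```

Under this reading, diagonal entry [3*a + b, 3*a + b] is 2 * input0^2 * arg7_8[b][b], so the diagonal repeats with period 3 and entries [1,1] and [4,4] should always be equal.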

Difference in output:

  Expected literal:
  f64[6,6] {
    { 0.000275566374291495, 0, 0, 0, 0, 0 },
    { 0, 0.00040918878254846619, 0, 0, 0, 0 },
    { 0, 0, -0.00058600272184509581, 0, 0, 0 },
    { 0, 0, 0, 0.000275566374291495, 0, 0 },
    { 0, 0, 0, 0, 0.00040918878254846619, 0 },
    { 0, 0, 0, 0, 0, -0.00058600272184509581 }
  }

  Actual literal:
  f64[6,6] {
    { 0.000275566374291495, 0, 0, 0, 0, 0 },
    { 0, 1, 0, 0, 0, 0 },
    { 0, 0, -0.00058600272184509581, 0, 0, 0 },
    { 0, 0, 0, 0.000275566374291495, 0, 0 },
    { 0, 0, 0, 0, 0.00040918878254846619, 0 },
    { 0, 0, 0, 0, 0, -0.00058600272184509581 }
  }
  1/1 runs miscompared.

This cannot be caused by fast math: entry [1,1], which is the result of element-wise multiplication of random input floats, is exactly 1 in the bad version, where 0.00040918878254846619 is expected. Entry [4,4], which should hold the same value, is correct.
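
To put a number on the size of the mismatch, here is a small Python sketch that compares the two diagonals copied from the literals above. The tolerance `REL_TOL` and the helper name `diagonal_mismatches` are hypothetical; the tolerance is deliberately much looser than anything f64 rounding or reassociation would plausibly produce.

```python
import math

# Diagonals copied from the "Expected literal" and "Actual literal" dumps.
expected = [0.000275566374291495, 0.00040918878254846619,
            -0.00058600272184509581, 0.000275566374291495,
            0.00040918878254846619, -0.00058600272184509581]
actual = [0.000275566374291495, 1.0,
          -0.00058600272184509581, 0.000275566374291495,
          0.00040918878254846619, -0.00058600272184509581]

REL_TOL = 1e-6  # hypothetical tolerance, far looser than f64 rounding error


def diagonal_mismatches(expected, actual, rel_tol=REL_TOL):
    """Return (index, expected, actual, relative_error) for each mismatch."""
    bad = []
    for i, (e, a) in enumerate(zip(expected, actual)):
        if not math.isclose(e, a, rel_tol=rel_tol):
            bad.append((i, e, a, abs(a - e) / abs(e)))
    return bad


for i, e, a, rel in diagonal_mismatches(expected, actual):
    print(f"[{i},{i}] expected {e!r}, got {a!r}, relative error {rel:.1f}")
```

Only index 1 is reported, and its relative error is more than three orders of magnitude, which is far outside what reordering floating-point operations could explain.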

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74185/new/

https://reviews.llvm.org/D74185