[PATCH] D74185: Revert the revert of vectorization commits

Fri Feb 7 15:20:05 PST 2020

ABataev added a comment.

In D74185#1865072 <https://reviews.llvm.org/D74185#1865072>, @george.karpenkov wrote:

> I did manage to reduce the test case further.
>
> For input IR: https://gist.github.com/cheshire/17067d5ba4781817861c8b21d15c928d
>
> Bad optimized version: https://gist.github.com/cheshire/bf1047b4385bcf82c22a70f5cf1fb5df
>
> Good optimized version: https://gist.github.com/cheshire/8bea1f36ab849f8945bc190b519272a6
>
> The compilation comes from XLA <https://www.tensorflow.org/xla/operation_semantics> test case which looks like this:
>
>   HloModule EntryModule
>  
>   ENTRY EntryModule {
>     %input0 = f64[] parameter(0)
>     %sign_227 = f64[] sign(f64[] %input0)
>     %multiply_235 = f64[] multiply(f64[] %sign_227, f64[] %sign_227)
>  
>     %p4 = f64[2,2] broadcast(%input0), dimensions={}
>     %dot_81 = f64[2,2] dot(f64[2,2] %p4, f64[2,2] %p4), lhs_contracting_dims={1}, rhs_contracting_dims={1}
>     %br2 = f64[2,2] broadcast(f64[] %multiply_235), dimensions={}
>     %reshape_294 = f64[2,2] multiply(f64[2,2] %dot_81, f64[2,2] br2)
>  
>     %broadcast_298 = f64[2,3,2,3] broadcast(f64[2,2] %reshape_294), dimensions={0,2}
>  
>     %arg7_8 = f64[3,3] parameter(1)
>     %broadcast_300 = f64[2,3,2,3] broadcast(f64[3,3] %arg7_8), dimensions={1,3}
>     %multiply_301 = f64[2,3,2,3] multiply(f64[2,3,2,3] %broadcast_298, f64[2,3,2,3] %broadcast_300)
>  
>     %reshape_302 = f64[6,6] reshape(f64[2,3,2,3] %multiply_301)
>     %zero = f64[] constant(0)
>     %zeros = f64[6,6] broadcast(f64[] %zero), dimensions={}
>  
>     %diag = pred[6,6] constant({{1,0,0,0,0,0}, {0,1,0,0,0,0}, {0,0,1,0,0,0}, {0,0,0,1,0,0}, {0,0,0,0,1,0}, {0,0,0,0,0,1}})
>     ROOT %select_316 = f64[6,6] select(pred[6,6] %diag, f64[6,6] %reshape_302, f64[6,6] %zeros)
>   }
>
>
> Essentially, it performs some element-wise multiplications on random input floats, and then replaces all non-diagonal entries with zeros.
>
> Difference in output:
>
>   Expected literal:
>   f64[6,6] {
>     { 0.000275566374291495, 0, 0, 0, 0, 0 },
>     { 0, 0.00040918878254846619, 0, 0, 0, 0 },
>     { 0, 0, -0.00058600272184509581, 0, 0, 0 },
>     { 0, 0, 0, 0.000275566374291495, 0, 0 },
>     { 0, 0, 0, 0, 0.00040918878254846619, 0 },
>     { 0, 0, 0, 0, 0, -0.00058600272184509581 }
>   }
>  
>   Actual literal:
>   f64[6,6] {
>     { 0.000275566374291495, 0, 0, 0, 0, 0 },
>     { 0, 1, 0, 0, 0, 0 },
>     { 0, 0, -0.00058600272184509581, 0, 0, 0 },
>     { 0, 0, 0, 0.000275566374291495, 0, 0 },
>     { 0, 0, 0, 0, 0.00040918878254846619, 0 },
>     { 0, 0, 0, 0, 0, -0.00058600272184509581 }
>   }
>   1/1 runs miscompared.
>  
>
>
> This is not something which can be caused by fast math: a number created by elementwise multiplication of random input floats is exactly "1" in the bad version.

Thanks, will check this.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74185/new/

https://reviews.llvm.org/D74185