[PATCH] D70673: [AArch64] Fix over-eager fusing of NEON SIMD MUL/ADD

Thu Dec 5 02:22:18 PST 2019

mstorsjo added a comment.

This broke vectorized code in one case. To reproduce, fetch https://martin.st/temp/g723.c and compile it with `clang -target aarch64-w64-mingw32 -c -O3 g723.c`. The diff in the generated code is available at https://martin.st/temp/g723.s.diff.

The key change is this:

  -       mul     v4.2s, v3.2s, v4.2s
  -       mls     v4.2s, v5.2s, v2.2s
  +       mul     v5.2s, v5.2s, v2.2s
  +       mls     v5.2s, v3.2s, v4.2s

So it seems the sign is flipped: the old code calculated `v3*v4 - v5*v2` (left in `v4`), while the new code calculates `v5*v2 - v3*v4` (left in `v5`).
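
To make the sign flip concrete, here is a minimal Python sketch that models the two instruction sequences lane by lane. It assumes the usual AArch64 semantics of `mls` (`Vd = Vd - Vn*Vm`) and treats lanes as plain integers, ignoring 32-bit wraparound. The helper names and the lane values are made up for illustration and are not taken from g723.c.

```python
def mul(vn, vm):
    """Lane-wise multiply, modelling `mul vd, vn, vm`."""
    return [a * b for a, b in zip(vn, vm)]


def mls(vd, vn, vm):
    """Lane-wise multiply-subtract, modelling `mls vd, vn, vm`:
    vd = vd - vn * vm (the destination is also the accumulator)."""
    return [d - a * b for d, a, b in zip(vd, vn, vm)]


def old_sequence(v2, v3, v4, v5):
    # mul v4.2s, v3.2s, v4.2s
    v4 = mul(v3, v4)
    # mls v4.2s, v5.2s, v2.2s
    v4 = mls(v4, v5, v2)
    return v4  # v3*v4 - v5*v2


def new_sequence(v2, v3, v4, v5):
    # mul v5.2s, v5.2s, v2.2s
    v5 = mul(v5, v2)
    # mls v5.2s, v3.2s, v4.2s
    v5 = mls(v5, v3, v4)
    return v5  # v5*v2 - v3*v4


# Arbitrary two-lane inputs (hypothetical values, .2s means two lanes).
v2, v3, v4, v5 = [2, 3], [5, 7], [11, 13], [17, 19]

old = old_sequence(v2, v3, v4, v5)
new = new_sequence(v2, v3, v4, v5)

print("old:", old)
print("new:", new)
```

In this model every lane of the new result is the negation of the corresponding lane of the old result, which matches the flipped sign seen in the diff.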

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70673