[PATCH] Reducing the costs of cast instructions to enable more vectorization of smaller types in LoopVectorize

Thu May 21 09:17:55 PDT 2015

Hi Sam,

If I understand correctly, this implementation only handles sext/zext -> op -> trunc chains that contain a single operation. Is that correct, or am I missing something? In theory it should be possible to handle arbitrarily long chains, depending on which operations they contain.
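
To illustrate the point about longer chains, here is a rough standalone sketch (not code from the patch; the function names are made up). It assumes the chain contains only operations such as add, sub, mul, and xor, where the low 8 bits of the result depend only on the low 8 bits of the operands. Under that assumption, evaluating the whole chain in 32 bits and truncating once gives the same value as evaluating every step in 8 bits:

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the patch.

// Wide form: zext to 32 bits, several operations, one trunc at the end.
uint8_t chainWide(uint8_t a, uint8_t b, uint8_t c) {
  uint32_t x = a, y = b, z = c;          // zext i8 -> i32
  uint32_t r = ((x + y) * z) ^ (x - z);  // chain of operations in i32
  return static_cast<uint8_t>(r);        // trunc i32 -> i8
}

// Narrow form: every intermediate result is truncated back to 8 bits.
uint8_t chainNarrow(uint8_t a, uint8_t b, uint8_t c) {
  uint8_t s = static_cast<uint8_t>(a + b);
  uint8_t m = static_cast<uint8_t>(s * c);
  uint8_t d = static_cast<uint8_t>(a - c);
  return static_cast<uint8_t>(m ^ d);
}
```

Operations such as right shifts, division, or comparisons do not have this property in general, which is why the answer depends on what is in the chain.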

-Silviu

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4519
@@ +4518,3 @@
+
+        if (ConstInt->uge((1 << ScalarWidth) - 1))
+          break;
----------------
APInt::getMaxValue(ScalarWidth) should be used here, as (1 << ScalarWidth) - 1 is evaluated in the host's int type and might return different values depending on the host machine.
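
For reference, the problem is that the literal 1 has type int, so the shift is undefined once ScalarWidth reaches the width of int (32 on most hosts). A minimal standalone sketch of a width-safe way to get the same all-ones value, assuming widths between 1 and 64 (maxValueForWidth is a hypothetical stand-in, not an LLVM function):

```cpp
#include <cstdint>

// Hypothetical stand-in for the value APInt::getMaxValue(Width) represents,
// valid only for 1 <= Width <= 64. Not LLVM code.
uint64_t maxValueForWidth(unsigned Width) {
  // Shift right from all-ones, so the shift amount is at most 63 and the
  // computation never depends on the width of the host's int.
  return UINT64_MAX >> (64 - Width);
}
```

With this form, maxValueForWidth(32) is 0xFFFFFFFF on every host, whereas (1 << 32) - 1 is undefined where int is 32 bits wide.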

================
Comment at: test/Transforms/LoopVectorize/AArch64/loop-vectorization-factors.ll:140
@@ +139,3 @@
+
+!0 = !{!"clang version 3.7.0 (http://llvm.org/git/clang.git 93281cbcbb9b1ea2b6788a629d6aef3284957d05) (http://llvm.org/git/llvm.git 40048e70d7a63ab64df3d0e52107f3c0e3472571)"}
+!1 = !{!2, !2, i64 0}
----------------
I think this would be cleaner without the metadata.

http://reviews.llvm.org/D9822
