[PATCH] D48332: [AArch64] Add custom lowering for v4i8 trunc store

Tue Jun 19 13:10:54 PDT 2018

efriedma added a comment.

I wonder if we should prefer to widen `<2 x i8>` and `<4 x i8>` to `<8 x i8>` instead of promoting to `<4 x i16>`. It would make stores like this a bit cheaper. It might be an interesting experiment at some point (mostly just modifying `AArch64TargetLowering::getPreferredVectorAction`, I think, and seeing what happens to the generated code).
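
To make the experiment above a bit more concrete, here is a standalone toy model of the decision, not actual LLVM code. The names `VectorAction`, `SimpleVT`, and `preferredVectorAction` are hypothetical stand-ins for the legalization action enum, the value type, and the `getPreferredVectorAction` hook; the fallback to promotion for other small vectors is an assumed simplification.

```cpp
// Toy model only: none of these names exist in LLVM.
enum class VectorAction { Legal, PromoteInteger, WidenVector };

// Hypothetical simplified vector type: element count and element width in bits.
struct SimpleVT {
  unsigned NumElts;
  unsigned EltBits;
};

constexpr VectorAction preferredVectorAction(SimpleVT VT) {
  // The experiment: widen <2 x i8> and <4 x i8> to <8 x i8>
  // instead of promoting the elements to i16.
  if (VT.EltBits == 8 && (VT.NumElts == 2 || VT.NumElts == 4))
    return VectorAction::WidenVector;

  // Assumption: vectors that fill a 64-bit or 128-bit NEON register are legal.
  const unsigned TotalBits = VT.NumElts * VT.EltBits;
  if (TotalBits == 64 || TotalBits == 128)
    return VectorAction::Legal;

  // Assumption: every other small vector keeps being promoted.
  return VectorAction::PromoteInteger;
}
```

In this model `<2 x i16>` still takes the promotion path, so it would not benefit from the change without extending the first condition.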

Do we need similar handling to this patch for `<2 x i16>` or `<2 x i8>`?

================
Comment at: lib/Target/AArch64/AArch64TargetTransformInfo.cpp:639
   if (Ty->isVectorTy() && Ty->getVectorElementType()->isIntegerTy(8) &&
-      Ty->getVectorNumElements() < 8) {
-    // We scalarize the loads/stores because there is not v.4b register and we
-    // have to promote the elements to v.4h.
+      Ty->getVectorNumElements() < 4) {
+    // We scalarize the loads/stores because there is not v.2b register and we
----------------
It looks like we still scalarize extloads?

================
Comment at: test/CodeGen/AArch64/neon-truncStore-extLoad.ll:27
+; CHECK: xtn [[TMP2:(v[0-9]+)]].8b, [[TMP]].8h
+; CHECK: {{st1 { [[TMP2]].4h }[0]|str s[0-9]+}}, [x{{[0-9]+|sp}}]
+  %b = trunc <4 x i32> %a to <4 x i8>
----------------
Why does this CHECK line have two possible lowerings?

Repository:
  rL LLVM

https://reviews.llvm.org/D48332