[PATCH] D48332: [AArch64] Add custom lowering for v4i8 trunc store

Thu Jun 21 11:27:07 PDT 2018

efriedma added inline comments.

================
Comment at: lib/Target/AArch64/AArch64TargetTransformInfo.cpp:639
   if (Ty->isVectorTy() && Ty->getVectorElementType()->isIntegerTy(8) &&
-      Ty->getVectorNumElements() < 8) {
-    // We scalarize the loads/stores because there is not v.4b register and we
-    // have to promote the elements to v.4h.
+      Ty->getVectorNumElements() < 4) {
+    // We scalarize the loads/stores because there is not v.2b register and we
----------------
efriedma wrote:
> It looks like we still scalarize extloads?
I'm still not sure this change belongs in this patch, given that we still scalarize `<4 x i8>` loads.

================
Comment at: test/CodeGen/AArch64/neon-truncStore-extLoad.ll:27
+; CHECK: xtn [[TMP2:(v[0-9]+)]].8b, [[TMP]].8h
+; CHECK: {{st1 { [[TMP2]].4h }[0]|str s[0-9]+}}, [x{{[0-9]+|sp}}]
+  %b = trunc <4 x i32> %a to <4 x i8>
----------------
zatrazz wrote:
> rengolin wrote:
> > efriedma wrote:
> > > Why does this CHECK line have two possible lowerings?
> > Looks like it's a copy of the surrounding pattern... weird...
> > 
> > What's the instruction actually generated?
> Indeed, I used the previous function (truncStore.v4i32) as an example; in current testing, the instruction generated is 'str'.
Please get rid of the "|" in all the patterns in this file.
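
As a rough sketch only: if `str` is the sole instruction generated, as reported above, the last CHECK line could drop the alternation and pin down a single lowering. The register patterns below are assumptions (in particular, that the store address is held in an x register rather than sp) and would need to be checked against the actual output:

```llvm
; Sketch under the assumptions stated above: one lowering, no "|".
; CHECK: xtn [[TMP2:(v[0-9]+)]].8b, [[TMP]].8h
; CHECK: str s{{[0-9]+}}, [x{{[0-9]+}}]
```

With a single pattern, the test fails if codegen ever switches to a different store instruction, instead of silently accepting either form.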

https://reviews.llvm.org/D48332