[PATCH] D103629: [AArch64] Cost-model i8 vector loads/stores

Fri Jun 4 09:51:05 PDT 2021

efriedma added a comment.

The two-instruction sequence leaves the bits in the right positions for a `<4 x i16>`.  If you need a `<4 x i32>`, you need another zip.  If you need sign-extension, you also need to sshr the result, or something like that.  So `%x = load <4 x i8>, <4 x i8>* %a` followed by `%y = sext <4 x i8> %x to <4 x i32>` would be four instructions total.
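
To make the count concrete, here is a rough byte-lane simulation of one sequence that would fit that description.  It is only a sketch: it assumes the load is an `ldr s0` and that each zip is a `zip1` of the register with itself, which may not be exactly what the backend emits.  The helper names are made up for illustration; registers are modeled as little-endian lists of bytes.

```python
# Sketch only: models one plausible ldr + zip1 + zip1 + sshr sequence.
# The choice of instructions is an assumption, not taken from the patch.

def ldr_s(mem):
    """LDR Sn: load 4 bytes into the low 32 bits; the upper bytes are zeroed."""
    return list(mem[:4]) + [0] * 4

def zip1_8b(a, b):
    """ZIP1 Vd.8B: interleave the low four bytes of a and b."""
    out = []
    for i in range(4):
        out += [a[i], b[i]]
    return out

def zip1_8h(a, b):
    """ZIP1 Vd.8H: interleave the low four halfwords of a and b (16 bytes out)."""
    out = []
    for i in range(4):
        out += a[2 * i:2 * i + 2] + b[2 * i:2 * i + 2]
    return out

def sshr_4s(v, shift):
    """SSHR Vd.4S: arithmetic right shift of each signed 32-bit lane."""
    lanes = []
    for i in range(4):
        lane = int.from_bytes(bytes(v[4 * i:4 * i + 4]), "little", signed=True)
        lanes.append(lane >> shift)
    return lanes

def sext_v4i8_to_v4i32(mem):
    """Four steps: load, widen to i16 lanes, widen to i32 lanes, sign-extend."""
    v = ldr_s(mem)          # 1: ldr  s0, [x0]
    v = zip1_8b(v, v)       # 2: zip1 v0.8b, v0.8b, v0.8b  (<4 x i16> positions)
    v = zip1_8h(v, v)       # 3: zip1 v0.8h, v0.8h, v0.8h  (<4 x i32> positions)
    return sshr_4s(v, 24)   # 4: sshr v0.4s, v0.4s, #24    (sign-extend each lane)

if __name__ == "__main__":
    print(sext_v4i8_to_v4i32([0x01, 0x80, 0x7F, 0xFF]))
```

In this model, each source byte ends up replicated across its 32-bit lane after the second zip, so a single arithmetic shift right by 24 finishes the sign-extension.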

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103629/new/

https://reviews.llvm.org/D103629