[PATCH] improve vectorizers by removing cost of unnecessary truncs and exts.

Mon May 11 05:48:57 PDT 2015

Hi Sam,

Thanks for working on this! This has been a running sore in ARM
vectorization for a long time now.

Initial comments:
   * In the future, could you please upload patches to Phabricator
(http://reviews.llvm.org)? It makes them much easier to review.
   * There are no regression tests included in this patch - did you forget
to git add them?
   * I'm not sure the logic is totally sound. Consider this:

loop:
  %1 = load i8* %foo
  %2 = zext i8 %1 to i32
  %3 = add i32 %2, 42
  %4 = trunc i32 %3 to i16
  %5 = add i16 %4, 42
  store i16 %5, i16* %bar
  br label %loop

getWidestType() will return i16. So the first cast (the zext) is not free,
but it is not a cast to i32 either: once vectorized, it is a cast to i16.
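To make the reasoning above concrete, here is a minimal sketch of the narrowing argument. It is not LLVM code: Cast, widestMemoryBits, narrowTo and isFreeAfterNarrowing are hypothetical names, and it assumes that only loads and stores determine the widest type and that values wider than that type can safely be narrowed to it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a cast instruction; not an LLVM class.
struct Cast {
  unsigned SrcBits;
  unsigned DstBits;
};

// Widest element width among the loop's loads and stores, in bits.
// Assumption: only memory operations determine the result.
unsigned widestMemoryBits(const std::vector<unsigned> &MemOpBits) {
  unsigned Widest = 0;
  for (unsigned Bits : MemOpBits)
    Widest = std::max(Widest, Bits);
  return Widest;
}

// The cast as it would look if every value wider than Widest were
// narrowed to Widest bits.
Cast narrowTo(Cast C, unsigned Widest) {
  return {std::min(C.SrcBits, Widest), std::min(C.DstBits, Widest)};
}

// A cast disappears when narrowing leaves source and destination equal.
bool isFreeAfterNarrowing(Cast C, unsigned Widest) {
  Cast N = narrowTo(C, Widest);
  return N.SrcBits == N.DstBits;
}
```

For the loop above (an i8 load and an i16 store), widestMemoryBits({8, 16}) gives 16. The zext {8, 32} narrows to {8, 16}, so it is still a real cast, while the trunc {32, 16} narrows to {16, 16} and disappears.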

I'm not exactly sure what Elena's query was; it looks like the
implementation here should be architecture-agnostic as it's just modelling
changes to the IR (truncate nodes disappear) and the rest is the TTI's
responsibility.
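
One way to picture that split, as a rough sketch only: the target-independent part decides whether a cast still exists after narrowing, and anything that survives is priced by a target hook. TargetCostModel, FlatCostTarget and vectorCastCost are hypothetical names standing in for the role TTI plays; they are not LLVM's API, and the flat cost value is a placeholder.

```cpp
// Hypothetical cast description; not an LLVM class.
struct Cast {
  unsigned SrcBits;
  unsigned DstBits;
};

// Hypothetical stand-in for the target hook (the role TTI plays).
struct TargetCostModel {
  virtual ~TargetCostModel() = default;
  virtual unsigned castCost(Cast C) const = 0;
};

// Target-independent part: a cast whose source and destination widths
// are equal after narrowing no longer exists in the vector code, so it
// costs nothing. Every other cast is priced by the target.
unsigned vectorCastCost(Cast Narrowed, const TargetCostModel &Target) {
  if (Narrowed.SrcBits == Narrowed.DstBits)
    return 0;
  return Target.castCost(Narrowed);
}

// Example target with a flat per-cast cost; the value is a placeholder.
struct FlatCostTarget : TargetCostModel {
  unsigned Cost;
  explicit FlatCostTarget(unsigned C) : Cost(C) {}
  unsigned castCost(Cast) const override { return Cost; }
};
```

With this shape, a target where a truncate is a single instruction and one where it takes several differ only in their castCost implementation; the zero-cost case never consults the target at all.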

Cheers,

James

On Mon, 11 May 2015 at 12:17 Demikhovsky, Elena <elena.demikhovsky at intel.com>
wrote:

> +        if (Opcode == Instruction::Trunc) {
> +          if (TTI->isTypeLegal(DstVecTy)) {
> +            VecCost = 0;
> +          }
>
> On AVX-512 the “truncate” is usually one instruction, the VecCost should
> be 1.
>
> On AVX the type may be legal, but “truncate” is more than one instruction.
>
> - Elena
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sam Parker
> Sent: Monday, May 11, 2015 13:57
> To: llvm-commits at cs.uiuc.edu
> Subject: [PATCH] improve vectorizers by removing cost of unnecessary
> truncs and exts.
>
>
>
> Hi,
>
>
>
> I’ve attached a patch to both the loop vectorizer and slp-vectorizer which
> checks whether truncs and extensions would actually be required if the
> code were vectorized. This lets the vectorizers understand that the cost
> of these instructions is effectively zero if vectorization happens. This
> is helpful when working on smaller data types, such as i8 and i16, that
> do not have native support in general purpose registers, but are
> supported in vector register files.
>
>
>
> Regards,
>
> Sam