[PATCH] improve vectorizers by removing cost of unnecessary truncs and exts.

Mon May 11 08:33:48 PDT 2015

Hi Sam,

It's a mismodelling, where you'll say something is cheaper than it actually
is.

Cheers,

James

On Mon, 11 May 2015 at 15:46 Sam Parker <sam.parker at arm.com> wrote:

> Hi James,
>
>
>
> By not totally sound, do you mean incorrect or that I’m just missing an
> extra opportunity and I should check that extensions aren’t targeting a
> larger than necessary bitwidth? As for the test file, I didn’t know if I
> was supposed to add it along with the patch or not, and will attach it in
> the next version.
>
>
>
> Cheers,
>
> Sam
>
>
>
>
>
> *From:* James Molloy [mailto:james at jamesmolloy.co.uk]
> *Sent:* 11 May 2015 13:49
> *To:* Demikhovsky, Elena; Sam Parker
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [PATCH] improve vectorizers by removing cost of
> unnecessary truncs and exts.
>
>
>
> Hi Sam,
>
> Thanks for working on this! This has been a running sore in ARM
> vectorization for a long time now.
>
>
>
> Initial comments:
>
>    * In the future, could you please use Phabricator (
> http://reviews.llvm.org) to upload patches to; it makes it much easier to
> review.
>
>    * There are no regression tests included in this patch - did you forget
> to git add them?
>
>    * I'm not sure the logic is totally sound. Consider this:
>
>
>
> loop:
>
>   %1 = load i8* %foo
>
>   %2 = zext i8 %1 to i32
>
>   %3 = add i32 %2, 42
>
>   %4 = trunc i32 %3 to i16
>
>   %5 = add i16 %4, 42
>
>   store i16* %5, %bar
>
>   br loop
>
>
>
> getWidestType() will return i16. So the first cast is not free, but
> neither is it a cast to i32, it's a cast to i16.
>
>
>
> I'm not exactly sure what Elena's query was; it looks like the
> implementation here should be architecture-agnostic as it's just modelling
> changes to the IR (truncate nodes disappear) and the rest is the TTI's
> responsibility.
>
>
>
> Cheers,
>
>
>
> James
>
>
>
> On Mon, 11 May 2015 at 12:17 Demikhovsky, Elena <
> elena.demikhovsky at intel.com> wrote:
>
> +        if (Opcode == Instruction::Trunc) {
>
> +          if (TTI->isTypeLegal(DstVecTy)) {
>
> +            VecCost = 0;
>
> +          }
>
>
>
> On AVX-512 the “truncate” is usually one instruction, the VecCost should
> be 1.
>
> On AVX the type may be legal, but “truncate” is more than one instruction.
>
>
>
> -          * Elena*
>
>
>
> *From:* llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] *On Behalf Of *Sam Parker
> *Sent:* Monday, May 11, 2015 13:57
> *To:* llvm-commits at cs.uiuc.edu
> *Subject:* [PATCH] improve vectorizers by removing cost of unnecessary
> truncs and exts.
>
>
>
> Hi,
>
>
>
> I’ve attached a patch to both the loop vectorizer and slp-vectorizer which
> checks to see whether truncs and extensions would actually be required if
> the code was vectorized. This is so that the vectorizers understand that
> the cost of these instructions is effectively zero if vectorization
> happens. This is helpful when working on smaller data types, such as i8 and
> i16, that do not have native support in general purpose registers, but are
> supported in vector register files.
>
>
>
> Regards,
>
> Sam
>
>
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150511/403bac07/attachment.html>