[PATCH] improve vectorizers by removing cost of unnecessary truncs and exts.
James Molloy
james at jamesmolloy.co.uk
Mon May 11 05:48:57 PDT 2015
Hi Sam,
Thanks for working on this! This has been a running sore in ARM
vectorization for a long time now.
Initial comments:
* In the future, could you please upload patches to Phabricator (
http://reviews.llvm.org)? It makes them much easier to review.
* There are no regression tests included in this patch - did you forget
to git add them?
* I'm not sure the logic is totally sound. Consider this:
  loop:
    %1 = load i8, i8* %foo
    %2 = zext i8 %1 to i32
    %3 = add i32 %2, 42
    %4 = trunc i32 %3 to i16
    %5 = add i16 %4, 42
    store i16 %5, i16* %bar
    br label %loop
getWidestType() will return i16, so the first cast is not free. But it is not
a cast to i32 either: once vectorized, it is a cast to i16.
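To make the concern concrete, here is a rough standalone sketch of the example loop. It is a toy model, not LLVM code: Inst, widestMemoryBits, effectiveCast and exampleLoop are hypothetical names, and the rule of clamping each cast to the widest memory type is my assumption about what the patch intends. It also ignores whether narrowing the intermediate arithmetic is legal at all.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of the example loop; none of these names are LLVM APIs.
enum class Op { Load, ZExt, Add, Trunc, Store };

struct Inst {
  Op op;
  unsigned srcBits;  // operand width (for Load/Store: the memory type)
  unsigned dstBits;  // result width (for Load/Store: the memory type)
};

// Widest type touched by a load or store, in bits. This only mirrors the
// behaviour the text attributes to getWidestType().
unsigned widestMemoryBits(const std::vector<Inst>& loop) {
  unsigned widest = 0;
  for (const Inst& i : loop)
    if (i.op == Op::Load || i.op == Op::Store)
      widest = std::max(widest, i.dstBits);
  return widest;
}

struct EffectiveCast {
  unsigned srcBits;
  unsigned dstBits;
  bool isFree() const { return srcBits == dstBits; }
};

// ASSUMPTION: after vectorization no value is wider than the widest memory
// type, so both ends of a cast are clamped to that width. A cast whose two
// ends coincide after clamping has disappeared.
EffectiveCast effectiveCast(const Inst& cast, unsigned widestBits) {
  assert(widestBits > 0 && "loop must contain a load or store");
  return {std::min(cast.srcBits, widestBits),
          std::min(cast.dstBits, widestBits)};
}

// The loop from the example: load i8, zext to i32, add, trunc to i16,
// add, store i16.
std::vector<Inst> exampleLoop() {
  return {{Op::Load, 8, 8},    {Op::ZExt, 8, 32}, {Op::Add, 32, 32},
          {Op::Trunc, 32, 16}, {Op::Add, 16, 16}, {Op::Store, 16, 16}};
}
```

Under these assumptions the zext comes out as a non-free i8 to i16 extension and the trunc comes out free, which is the case a check of the form "is this a cast to i32?" would get wrong.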
I'm not exactly sure what Elena's query was; it looks like the
implementation here should be architecture-agnostic as it's just modelling
changes to the IR (truncate nodes disappear) and the rest is the TTI's
responsibility.
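The split I have in mind could be sketched roughly as follows. Again this is a toy model, not LLVM code: Cast, TargetCastCost and totalCastCost are hypothetical names, and the callback merely stands in for whatever the TTI would report for a cast that is really emitted.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy model only; none of these names are LLVM APIs.
struct Cast {
  unsigned srcBits;  // element width before the cast, after vectorization
  unsigned dstBits;  // element width after the cast, after vectorization
};

// Stand-in for the target hook: prices a cast that really is emitted.
using TargetCastCost = std::function<unsigned(const Cast&)>;

// Target-independent part: a cast whose source and destination widths
// coincide after vectorization has vanished from the IR, so it is never
// priced. Every remaining cast is handed to the target unchanged.
unsigned totalCastCost(const std::vector<Cast>& casts,
                       const TargetCastCost& target) {
  unsigned total = 0;
  for (const Cast& c : casts)
    if (c.srcBits != c.dstBits)
      total += target(c);
  return total;
}
```

The point of the design is that the generic code only decides whether a cast still exists. How expensive a surviving truncate is on a given target stays entirely behind the callback.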
Cheers,
James
On Mon, 11 May 2015 at 12:17 Demikhovsky, Elena <elena.demikhovsky at intel.com>
wrote:
> +    if (Opcode == Instruction::Trunc) {
> +      if (TTI->isTypeLegal(DstVecTy)) {
> +        VecCost = 0;
> +      }
>
> On AVX-512 the “truncate” is usually one instruction, the VecCost should
> be 1.
>
> On AVX the type may be legal, but “truncate” is more than one instruction.
>
>
>
> - * Elena*
>
>
>
> *From:* llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] *On Behalf Of *Sam Parker
> *Sent:* Monday, May 11, 2015 13:57
> *To:* llvm-commits at cs.uiuc.edu
> *Subject:* [PATCH] improve vectorizers by removing cost of unnecessary
> truncs and exts.
>
>
>
> Hi,
>
>
>
> I’ve attached a patch to both the loop vectorizer and slp-vectorizer which
> checks to see whether truncs and extensions would actually be required if
> the code was vectorized. This is so that the vectorizers understand that
> the cost of these instructions is effectively zero if vectorization
> happens. This is helpful when working on smaller data types, such as i8 and
> i16, that do not have native support in general purpose registers, but are
> supported in vector register files.
>
>
>
> Regards,
>
> Sam
>
>
>
>
>