[llvm-dev] [RFC] Vector Predication
Saito, Hideki via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 31 17:41:26 PST 2019
I think you and I are talking about two different things.
As far as Intel’s vector function ABI is concerned, unless the programmer specifically says otherwise, given an OpenMP declare simd function, the compiler will
deduce the VF from the HW vector register size and the function signature. Of course, there can be different vector function ABIs for different targets. The Intel
compiler's cost model uses the vector function VF as part of determining the loop vectorization VF. So, the two are tightly coupled.
A hypothetical vector target may vectorize such a function for a 4096-bit vector and also pass it an explicit VF parameter of 20, so that only the lowest
20 elements of the whole vector are executed.
I think this scenario answers Philip’s question about why the mask and VF parameters are separate and why VF can’t be conservatively deduced from the mask or the mask computation.
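To make the scenario concrete, here is a rough scalar C model of such a vector function. It is only a sketch under stated assumptions: the names `vec_add_one`, `LANES`, and `vl` are hypothetical, the element type is assumed to be 32 bits (so a 4096-bit register holds 128 lanes), and no real vector function ABI is being described. The point it illustrates is that the callee receives both a per-lane mask and a separate active length, and a lane is executed only if it passes both.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4096-bit register with 32-bit elements gives 128 lanes. */
enum { LANES = 128 };

/*
 * Hypothetical vector variant of a scalar function f(x) = x + 1.
 * 'mask' is the application-level predicate, one entry per lane.
 * 'vl' is the explicit VF parameter: only lanes [0, vl) may execute,
 * regardless of what the mask says for the higher lanes.
 */
static void vec_add_one(int32_t *dst, const int32_t *src,
                        const bool *mask, size_t vl)
{
    for (size_t i = 0; i < LANES; ++i) {
        /* A lane is active only if it is below vl AND its mask bit is set. */
        if (i < vl && mask[i])
            dst[i] = src[i] + 1;
        /* Inactive lanes leave dst[i] untouched in this sketch. */
    }
}
```

Calling this with `vl` set to 20 matches the case described above: mask entries for lanes 20 and up may still be set, so the callee cannot recover the value 20 from the mask alone.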
From: Bruce Hoult [mailto:bruce at hoult.org]
Sent: Thursday, January 31, 2019 5:13 PM
To: Saito, Hideki <hideki.saito at intel.com>
Cc: Philip Reames <listmail at philipreames.com>; Robin Kruppe <robin.kruppe at gmail.com>; David Greene <dag at cray.com>; via llvm-dev <llvm-dev at lists.llvm.org>; Maslov, Sergey V <sergey.v.maslov at intel.com>; Topper, Craig <craig.topper at intel.com>
Subject: Re: [llvm-dev] [RFC] Vector Predication
On Thu, Jan 31, 2019 at 4:31 PM Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>when we have a mask loaded from an external source (memory, function call boundary, etc...) and a short sequence of vector ops
A mask value passed as a function call parameter is common. An OpenMP declare simd function does exactly that for the masked cases.
Such a mask is at the application level, not at the vector strip-mining loop level.
As well as possibly being many times longer than the masks the hardware works with, it's likely not even in the format the hardware uses: different library APIs might pack a mask into bits, or use one mask element per byte, short, or int.
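As an illustration of that format mismatch, the following C sketch converts a byte-per-element application mask into a packed bit mask and then pulls out one hardware-sized chunk of it. The helper names `pack_byte_mask` and `mask_chunk`, the LSB-first bit order, and the 64-lane upper bound on a chunk are all assumptions made for the example, not the layout of any particular library or target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Pack a byte-per-element mask (nonzero means active) into bits,
 * least significant bit first. 'bits' must hold (n + 7) / 8 bytes.
 */
static void pack_byte_mask(uint8_t *bits, const uint8_t *bytes, size_t n)
{
    memset(bits, 0, (n + 7) / 8);
    for (size_t i = 0; i < n; ++i) {
        if (bytes[i])
            bits[i / 8] |= (uint8_t)(1u << (i % 8));
    }
}

/*
 * Extract up to 'width' lanes (at most 64) starting at lane 'start'
 * from an n-lane packed mask. Lanes past the end of the application
 * mask read as inactive.
 */
static uint64_t mask_chunk(const uint8_t *bits, size_t n,
                           size_t start, size_t width)
{
    uint64_t chunk = 0;
    for (size_t j = 0; j < width && j < 64 && start + j < n; ++j) {
        size_t i = start + j;
        if (bits[i / 8] & (1u << (i % 8)))
            chunk |= (uint64_t)1 << j;
    }
    return chunk;
}
```

A strip-mining loop would call something like `mask_chunk` once per iteration, which is extra work that a mask already in the hardware's own format would not need.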