[llvm-dev] Status of llvm.experimental.vector.reduce.* intrinsics

Fri Aug 4 07:20:03 PDT 2017

Bitcasting is only valid between types of the same size, so a <4 x i1> mask
is not zero extended: you bitcast it to i4 and then compare that directly,
e.g. icmp ne i4 %castval, 0.
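
For illustration only, here is a minimal Python model of that idea (not
LLVM IR, and not code from any LLVM pass). It mimics bitcasting a <4 x i1>
mask to an i4 and comparing the result against zero, and shows that this
agrees with an OR reduction over the lanes. The function names are
hypothetical, and placing lane 0 in the least significant bit is an
assumption; the bit order does not affect a comparison against zero.

```python
def bitcast_mask_to_int(mask):
    """Pack a list of booleans into an integer, one bit per lane.

    Assumption: lane 0 lands in the least significant bit.
    """
    bits = 0
    for lane, value in enumerate(mask):
        if value:
            bits |= 1 << lane
    return bits


def any_lane_set(mask):
    """Model of 'bitcast to iN, then compare against zero'."""
    return bitcast_mask_to_int(mask) != 0


def or_reduce(mask):
    """Model of an OR reduction over all lanes of the mask."""
    result = False
    for value in mask:
        result = result or value
    return result
```

With a four-lane mask the packed value fits in four bits, which is why the
bitcast target is i4 rather than a zero-extended i8.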

Amara

On 4 August 2017 at 15:03, Haidl, Michael <michael.haidl at uni-muenster.de>
wrote:

> I assume smaller types like <4 x i1> are getting zero extended to e.g., i8?
>
> On 04.08.2017 at 15:58, Amara Emerson wrote:
> > Actually, for mask vectors of i1 values, you don't need to use reductions
> > at all (although for SVE this is what we'll do). You can instead bitcast
> > the vector value to an i8/i16/whatever and then compare against zero.
> >
> > Amara
> >
> > On 4 August 2017 at 14:55, Haidl, Michael via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> >
> >     I am currently working on a transformation pass that transforms
> >     masked.load and masked.store intrinsics to (hopefully) increase
> >     performance on targets where masked.load and masked.store are not
> >     legal. To check if the loads and stores are necessary at all, I take
> >     the mask for the masked operations and want to reduce it to a single
> >     value. vector.reduce.or seemed very handy to do the job.
> >
> >     I will take a look at the function you suggested. Maybe I can come up
> >     with something that moves the development of these intrinsics forward.
> >
> >     Cheers,
> >     Michael
> >
> >     On 04.08.2017 at 15:25, Amara Emerson wrote:
> >      > Can you tell us what you're looking to do with the intrinsics?
> >      >
> >      > On all non-AArch64 targets the ExpandReductions pass will convert
> >      > them to the shuffle pattern as you're seeing. That pass was written
> >      > to allow experimentation with the effects of using reduction
> >      > intrinsics at the IR level only, hence we convert into the shuffles
> >      > very late in the pass pipeline.
> >      >
> >      > Since we haven't seen any adverse effects of representing the
> >      > reductions as intrinsics at the IR level, I think in that respect
> >      > the intrinsics have probably proven themselves to be stable.
> >      > However, the error you're seeing is because the AArch64 backend
> >      > still expects to deal only with intrinsics it can *natively*
> >      > support, and i1 is not a natively supported type for reductions.
> >      > See the code in
> >      > AArch64TargetTransformInfo.cpp:useReductionIntrinsic() for where we
> >      > decide which reduction types we can support.
> >      >
> >      > For these cases, we need to implement more generic legalization
> >      > support in order to either promote to a legal type or, in cases
> >      > where the target cannot support it as a native operation at all, to
> >      > expand it to a shuffle pattern as a fallback. Once we have all that
> >      > in place, I think we're in a strong position to move to the
> >      > intrinsic form as the canonical representation.
> >      >
> >      > FYI one of the motivating reasons for these to be introduced was to
> >      > allow non-power-of-2 vector architectures like SVE to express
> >      > reduction operations.
> >      >
> >      > Amara
> >      >
> >      > On 4 August 2017 at 13:36, Haidl, Michael
> >      > <michael.haidl at uni-muenster.de> wrote:
> >      >
> >      >     Hi Renato,
> >      >
> >      >     just to make it clear, I didn't implement reductions on x86_64;
> >      >     they just worked when I tried to lower an
> >      >     llvm.experimental.vector.reduce.or.i1.v8i1 intrinsic. A shuffle
> >      >     pattern is generated for the intrinsic.
> >      >
> >      >              vpshufd $78, %xmm0, %xmm1       # xmm1 = xmm0[2,3,0,1]
> >      >              vpor    %xmm1, %xmm0, %xmm0
> >      >              vpshufd $229, %xmm0, %xmm1      # xmm1 = xmm0[1,1,2,3]
> >      >              vpor    %xmm1, %xmm0, %xmm0
> >      >              vpsrld  $16, %xmm0, %xmm1
> >      >              vpor    %xmm1, %xmm0, %xmm0
> >      >              vpextrb $0, %xmm0, %eax
> >      >
> >      >
> >      >     However, on AArch64 I encountered an unreachable where codegen
> >      >     does not know how to promote the i1 type. Since I am more
> >      >     familiar with the mid-level, I have to start digging into
> >      >     codegen. Any hints on where to start would be awesome.
> >      >
> >      >     Cheers,
> >      >     Michael
> >      >
> >      >     On 04.08.2017 at 08:18, Renato Golin wrote:
> >      >      > On 3 August 2017 at 19:48, Haidl, Michael via llvm-dev
> >      >      > <llvm-dev at lists.llvm.org> wrote:
> >      >      >> thank you for the clarification. I tested the intrinsics on
> >      >      >> x86_64 and they seemed to work pretty well. Looking forward
> >      >      >> to trying these intrinsics with the AArch64 backend. Maybe I
> >      >      >> will find the time to look into codegen to get these
> >      >      >> intrinsics out of the experimental stage. They seem pretty
> >      >      >> useful.
> >      >      >
> >      >      > In addition to Amara's point, it'd be good to have it working
> >      >      > and enabled by default for other architectures before we can
> >      >      > move out of experimental, if we indeed intend to make it
> >      >      > non-arch-specific (which we do).
> >      >      >
> >      >      > So, if you could share your code for the x86 port, that'd be
> >      >      > great. But if you could help with the final touches on the
> >      >      > code-gen part, that'd be awesome.
> >      >      >
> >      >      > cheers,
> >      >      > --renato
> >      >      >
> >      >
> >      >
> >
> >     _______________________________________________
> >     LLVM Developers mailing list
> >     llvm-dev at lists.llvm.org
> >     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> >
>