[llvm-dev] Status of llvm.experimental.vector.reduce.* intrinsics
Haidl, Michael via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 4 07:03:31 PDT 2017
I assume smaller types like <4 x i1> are zero-extended to, e.g., i8?
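
To make the bitcast suggestion quoted below concrete, here is a minimal C++ sketch (plain C++, not LLVM code) of why "pack the mask into an integer and compare against zero" gives the same answer as an OR reduction. The names Mask8, reduceOr, packMask and anyLaneSet are hypothetical and exist only for this illustration. The lane-to-bit order is an assumption that the zero test does not depend on. A 4-lane mask would, under the same assumption, occupy only the low four bits of the integer, with the remaining bits zero.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an <8 x i1> mask: one bool per lane.
using Mask8 = std::array<bool, 8>;

// Reference semantics of an OR reduction over the mask lanes.
constexpr bool reduceOr(const Mask8 &mask) {
  bool any = false;
  for (std::size_t lane = 0; lane < mask.size(); ++lane)
    any = any || mask[lane];
  return any;
}

// Models bitcasting the mask to an 8-bit integer: lane i becomes bit i.
// The lane-to-bit order is an assumption of this sketch; the comparison
// against zero gives the same result for any order.
constexpr std::uint8_t packMask(const Mask8 &mask) {
  std::uint8_t bits = 0;
  for (std::size_t lane = 0; lane < mask.size(); ++lane)
    if (mask[lane])
      bits = static_cast<std::uint8_t>(bits | (1u << lane));
  return bits;
}

// The reduction-free test: a lane is set iff the packed integer is nonzero.
constexpr bool anyLaneSet(const Mask8 &mask) {
  return packMask(mask) != 0;
}
```

If every lane is false, the packed value is zero and the masked load or store could be skipped entirely; any set lane makes it nonzero.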
On 04.08.2017 at 15:58, Amara Emerson wrote:
> Actually, for mask vectors of i1 values you don't need to use reductions
> at all (although for SVE this is what we'll do). You can instead bitcast
> the vector value to an i8/i16/whatever and then compare against zero.
>
> Amara
>
> On 4 August 2017 at 14:55, Haidl, Michael via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>
> I am currently working on a transformation pass that transforms
> masked.load and masked.store intrinsics to (hopefully) increase
> performance on targets where masked.load and masked.store are not legal.
> To check whether the loads and stores are necessary at all, I take the mask
> of each masked operation and want to reduce it to a single value.
> vector.reduce.or seemed very handy for the job.
>
> I will take a look at the function you suggested. Maybe I can come up
> with something that moves the development of these intrinsics forward.
>
> Cheers,
> Michael
>
> On 04.08.2017 at 15:25, Amara Emerson wrote:
> > Can you tell us what you're looking to do with the intrinsics?
> >
> > On all non-AArch64 targets the ExpandReductions pass will convert them
> > to the shuffle pattern you're seeing. That pass was written to allow
> > experimentation with the effects of using reduction intrinsics at the
> > IR level only, hence we convert into the shuffles very late in the
> > pass pipeline.
> >
> > Since we haven't seen any adverse effects of representing the reductions
> > as intrinsics at the IR level, I think in that respect the intrinsics
> > have probably proven themselves to be stable. However, the error you're
> > seeing is because the AArch64 backend still expects to deal with only
> > intrinsics it can *natively* support, and i1 is not a natively supported
> > type for reductions. See the code in
> > AArch64TargetTransformInfo.cpp:useReductionIntrinsic() for where we
> > decide which reduction types we can support.
> >
> > For these cases, we need to implement more generic legalization support
> > in order to either promote to a legal type or, in cases where the target
> > cannot support it as a native operation at all, expand it to a
> > shuffle pattern as a fallback. Once we have all that in place, I think
> > we're in a strong position to move to the intrinsic form as the
> > canonical representation.
> >
> > FYI one of the motivating reasons for these to be introduced was to
> > allow non power-of-2 vector architectures like SVE to express reduction
> > operations.
> >
> > Amara
> >
> > On 4 August 2017 at 13:36, Haidl, Michael
> > <michael.haidl at uni-muenster.de> wrote:
> >
> > Hi Renato,
> >
> > just to make it clear, I didn't implement reductions on x86_64; they just
> > worked when I tried to lower an
> > llvm.experimental.vector.reduce.or.i1.v8i1 intrinsic. A shuffle pattern
> > is generated for the intrinsic.
> >
> > vpshufd $78, %xmm0, %xmm1 # xmm1 = xmm0[2,3,0,1]
> > vpor %xmm1, %xmm0, %xmm0
> > vpshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]
> > vpor %xmm1, %xmm0, %xmm0
> > vpsrld $16, %xmm0, %xmm1
> > vpor %xmm1, %xmm0, %xmm0
> > vpextrb $0, %xmm0, %eax
> >
> >
> > However, on AArch64 I encountered an unreachable where codegen does not
> > know how to promote the i1 type. Since I am more familiar with the
> > mid-level, I have to start digging into codegen. Any hints on where to
> > start would be awesome.
> >
> > Cheers,
> > Michael
> >
> > On 04.08.2017 at 08:18, Renato Golin wrote:
> > > On 3 August 2017 at 19:48, Haidl, Michael via llvm-dev
> > > <llvm-dev at lists.llvm.org> wrote:
> > >> thank you for the clarification. I tested the intrinsics on x86_64 and
> > >> they seemed to work pretty well. Looking forward to trying these
> > >> intrinsics with the AArch64 backend. Maybe I will find the time to look
> > >> into codegen to get these intrinsics out of the experimental stage.
> > >> They seem pretty useful.
> > >
> > > In addition to Amara's point, it'd be good to have it working and
> > > default for other architectures before we can move out of experimental
> > > if we indeed intend to make it non-arch-specific (which we do).
> > >
> > > So, if you could share your code for the x86 port, that'd be great.
> > > But if you could help with the final touches on the code-gen part,
> > > that'd be awesome.
> > >
> > > cheers,
> > > --renato
> > >
> >
> >
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
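
For readers following the thread, the shuffle pattern quoted above for x86_64 appears to be a log-step reduction: each shuffle (or shift) moves the upper half of the remaining lanes onto the lower half, a vpor combines them, and element 0 is extracted at the end. The C++ sketch below models that idea on a plain array. It is an illustration under that reading of the assembly, not the ExpandReductions implementation, and Vec8 and reduceOrByHalving are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 8-lane vector standing in for a promoted <8 x i1> value.
struct Vec8 {
  std::uint8_t lane[8];
};

// OR-reduce by repeatedly folding the upper half of the active lanes onto
// the lower half. For 8 lanes this takes three steps (widths 4, 2, 1),
// matching the three shuffle/shift + vpor pairs in the quoted assembly.
constexpr std::uint8_t reduceOrByHalving(Vec8 v) {
  for (std::size_t width = 4; width >= 1; width /= 2) {
    for (std::size_t i = 0; i < width; ++i)
      v.lane[i] = static_cast<std::uint8_t>(v.lane[i] | v.lane[i + width]);
  }
  return v.lane[0];  // corresponds to extracting element 0
}
```

The halving scheme needs log2(N) combine steps and assumes a power-of-2 lane count, which is one reason the intrinsic form is attractive for architectures like SVE, as noted in the thread.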