[LLVMdev] PTX builtin functions.

Justin Holewinski justin.holewinski at gmail.com
Mon Nov 21 09:31:02 PST 2011


On Mon, Nov 21, 2011 at 11:45 AM, Alberto Magni
<alberto.magni86 at gmail.com> wrote:

> On Mon, Nov 21, 2011 at 3:36 PM, Justin Holewinski
> <justin.holewinski at gmail.com> wrote:
> > On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni
> > <alberto.magni86 at gmail.com> wrote:
> >>
> >> Hi Justin,
> >>
> >> attached you find the patch for the integer max instruction.
> >> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
> >> is almost an exact copy of  PTX_INT3 in PTXInstrInfo.td, maybe
> >> a modification of this class can be defined in a separate file.
> >
> >
> > I'm copying llvmdev.  We should keep discussions like this on the list for
> > the benefit of others.
>
> I always forget "Reply to All".
>
> > We can probably factor out a generic description, or even just use the
> > PTX_INT3 multiclass directly.  The PTXIntrinsicInstrInfo.td file is
> > included by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is
> > available in PTXIntrinsicInstrInfo.td.
>
> I agree with you but my class PTX_INTRINSIC_INT3 works with an Intrinsic
> and not with a SDNode, like PTX_INT3.
> PTX_INTRINSIC_INT3 also requires the presence of the type of
> the immediate in the pattern, e.g. (i32 imm:$b).
>

Alright, I'm fine with that.


>
> >>
> >>
> >> Do you agree with this approach?
> >> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
> >> (a clone of PTX_INT3_SIGNED) is required?
> >
> >
> > Yes, I believe we should split these into signed and unsigned variants.
> > The results of max/min operations can definitely be different depending on
> > whether the operands are signed or unsigned.  Since this information is
> > not encoded in LLVM types, we may want to create two versions for each
> > integer type; something like:
> >
> > i32 @llvm.ptx.max.signed.i32(i32, i32)
> > i32 @llvm.ptx.max.unsigned.i32(i32, i32)
>
> Yes, this is the only way.
>
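
To make the signed/unsigned point above concrete, here is a minimal C++
sketch (not part of the patch; the helper names max_signed_i32 and
max_unsigned_i32 are made up for illustration and only mirror the proposed
intrinsic names).  It shows the same 32-bit pattern giving different max
results depending on how the bits are interpreted, which is why a single
i32 intrinsic cannot cover both cases.  It assumes two's-complement
conversion between uint32_t and int32_t, which is what mainstream compilers
do and what C++20 guarantees.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for @llvm.ptx.max.signed.i32: the raw 32-bit
// operands are reinterpreted as signed values before comparing.
constexpr std::uint32_t max_signed_i32(std::uint32_t a, std::uint32_t b) {
  const std::int32_t sa = static_cast<std::int32_t>(a);
  const std::int32_t sb = static_cast<std::int32_t>(b);
  return static_cast<std::uint32_t>(std::max(sa, sb));
}

// Hypothetical stand-in for @llvm.ptx.max.unsigned.i32: the raw 32-bit
// operands are compared as unsigned values.
constexpr std::uint32_t max_unsigned_i32(std::uint32_t a, std::uint32_t b) {
  return std::max(a, b);
}

// 0xFFFFFFFF is -1 when signed but 4294967295 when unsigned, so the two
// variants disagree on max(0xFFFFFFFF, 1).
static_assert(max_signed_i32(0xFFFFFFFFu, 1u) == 1u,
              "signed: max(-1, 1) is 1");
static_assert(max_unsigned_i32(0xFFFFFFFFu, 1u) == 0xFFFFFFFFu,
              "unsigned: max(4294967295, 1) is 4294967295");
```

For operands whose top bit is clear (for example 2 and 3), both helpers
return the same value; they only diverge once at least one operand has the
sign bit set.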

A couple more comments:

   1. Please make sure to set TargetPrefix="ptx" for the intrinsics
   (probably best in the multiclass, see PTXReadSpecialRegisterIntrinsic_r32)
   2. I'm not sure how to define a GCCBuiltin for an intrinsic that can
   take multiple types, but it's probably worth looking into so we can expose
   this intrinsic to Clang.



>
> >
> > Otherwise, the patch looks good.
> >
> >>
> >>
> >> Thanks,
> >>
> >> Alberto
> >>
> >> On Wed, Nov 16, 2011 at 5:44 PM, Alberto Magni
> >> <alberto.magni86 at gmail.com> wrote:
> >> > On Wed, Nov 16, 2011 at 2:17 PM, Justin Holewinski
> >> > <justin.holewinski at gmail.com> wrote:
> >> >> On Wed, Nov 16, 2011 at 9:16 AM, Justin Holewinski
> >> >> <justin.holewinski at gmail.com> wrote:
> >> >>>
> >> >>> On Wed, Nov 16, 2011 at 8:05 AM, Alberto Magni
> >> >>> <alberto.magni86 at gmail.com> wrote:
> >> >>>>
> >> >>>> Dear Justin,
> >> >>>>
> >> >>>> I am trying to add the support for some OpenCL builtin functions to
> >> >>>> the PTX backend.
> >> >>>> The attached file represents the first stub of a patch for the fmax
> >> >>>> builtin function.
> >> >>>
> >> >>> First off, thanks for helping to improve the PTX back-end!
> >> >>> There are really two main issues here.  First, OpenCL built-in functions
> >> >>> do not belong in the PTX back-end.  These will be implemented in the
> >> >>> libclc library (http://www.pcc.me.uk/~peter/libclc).  The back-end will
> >> >>> only implement PTX intrinsics, which may be used by the OpenCL built-in
> >> >>> functions in libclc.  However, this particular function (max) corresponds
> >> >>> to a PTX instruction, so it makes sense to implement it as an intrinsic
> >> >>> in the back-end.
> >> >>> Second, intrinsic functions require a bit more work.  You're off to a
> >> >>> great start, but intrinsics are implemented a bit differently.  It looks
> >> >>> like LLVM does not have a max intrinsic, so we'll need to create one.
> >> >>> Have a look at include/llvm/IntrinsicsPTX.td.  This file defines the
> >> >>> PTX-specific intrinsics.  You can add an intrinsic for max here, and then
> >> >>> implement a pattern-match in the PTXInstrInfo.td file.  There is no need
> >> >>> to create a new SDNode type for intrinsics, unless they require some
> >> >>> special handling in the C++ code, which I do not see being the case here.
> >> >>
> >> >> Sorry, there's a typo here.  The intrinsic pattern matching goes in
> >> >> PTXIntrinsicInstrInfo.td.
> >> >>
> >> >
> >> > Thank you for the pointers. I will let you know when I have the first
> >> > patch.
> >> >
> >> >>>
> >> >>> When you define a new intrinsic, use the following template as a name:
> >> >>> int_ptx_max.  This will define the LLVM intrinsic as @llvm.ptx.max().
> >> >>> Please follow the same convention when naming the __builtin_* function.
> >> >>>
> >> >>>>
> >> >>>> The test case I am trying is the following:
> >> >>>>
> >> >>>> define ptx_device float @f(float %x, float %y) {
> >> >>>> entry:
> >> >>>>  %z = call float @fmax(float %x, float %y)
> >> >>>>  ret float %z
> >> >>>> }
> >> >>>>
> >> >>>> declare float @fmax(float, float)
> >> >>>>
> >> >>>> But at the moment llc crashes saying that "calls are not supported";
> >> >>>> this does not happen with llvm builtins like llvm.sqrt.f32.
> >> >>>
> >> >>> Which version of LLVM are you using?  Calls to PTX device functions
> >> >>> have been implemented for a little while now, so I'm surprised to see
> >> >>> that error.  Perhaps it's because the fmax function is not defined as
> >> >>> ptx_device.
> >> >>>
> >> >
> >> > This is the test case that I am using to verify that the max builtin
> >> > function I am implementing is actually recognised. I took inspiration
> >> > from the llvm-intrinsic.ll test case.
> >> > The command I am using to compile is:
> >> >
> >> > llc -march=ptx32 -mattr=+ptx22 fmax.ll
> >> >
> >> > The option -mattr does not seem to have any effect.
> >> > I tried also with the ptx_device qualifier with the same outcome.
> >> > I am using llvm from the svn repository.
> >> >
> >> > Bye,
> >> >
> >> > Alberto
> >> >
> >> >>>>
> >> >>>> Can you please give me a hint on what I am missing, or some general
> >> >>>> advice on how to add builtin functions?
> >> >>>>
> >> >>>> Thank you in advance,
> >> >>>>
> >> >>>> Alberto.
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> LLVM Developers mailing list
> >> >>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Thanks,
> >> >>> Justin Holewinski
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Thanks,
> >> >> Justin Holewinski
> >> >>
> >
> >
> >
> >
> > --
> >
> > Thanks,
> >
> > Justin Holewinski
> >
>



-- 

Thanks,

Justin Holewinski