[LLVMdev] Attaching range metadata to IntrinsicInst

Tue Jun 17 15:04:29 PDT 2014

----- Original Message -----
> From: "Eric Christopher" <echristo at gmail.com>
> To: "Jingyue Wu" <jingyue at google.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, June 17, 2014 4:43:28 PM
> Subject: Re: [LLVMdev] Attaching range metadata to IntrinsicInst
> 
> On Tue, Jun 17, 2014 at 2:33 PM, Jingyue Wu <jingyue at google.com>
> wrote:
> > Hi Eric,
> >
> > In the IR, besides "target datalayout" and "target triple", we have
> > a
> > special "target cpu" string which is set by the Clang front-end
> > according to
> > its -target-cpu flag. We also write a Module::getTargetCPU() method
> > to
> > retrieve this string from the IR.
> >
> 
> Not sure that I like this. Each function can have a target cpu
> though.
> That each subtarget cares about the value of the target cpu for how
> the intrinsic works sounds a lot like TargetTransformInfo to me.
> 
> -eric

I also think we should avoid this; as I said earlier, ValueTracking is used during canonicalization, and the community consensus seems to be to try, to the extent possible, to make this canonical form backend independent (even for intrinsics). Having Clang add the range metadata seems preferable in this case (and, as a side effect, gives us a new generally-useful capability).

 -Hal

> 
> > Jingyue
> >
> >
> > On Tue, Jun 17, 2014 at 2:22 PM, Eric Christopher
> > <echristo at gmail.com>
> > wrote:
> >>
> >> Eh? How do you envision this?
> >>
> >> -eric
> >>
> >> On Tue, Jun 17, 2014 at 2:09 PM, Jingyue Wu <jingyue at google.com>
> >> wrote:
> >> > Hi Nick,
> >> >
> >> > That makes sense. I think a main issue here is that the ranges
> >> > of these
> >> > PTX
> >> > special registers (e.g., threadIdx.x) depend on -target-cpu
> >> > which is
> >> > only
> >> > visible to clang and llc. Would you mind we specify "target cpu"
> >> > in the
> >> > IR
> >> > similar to what we did for "target triple"?
> >> >
> >> > Thanks,
> >> > Jingyue
> >> >
> >> >
> >> > On Tue, Jun 17, 2014 at 12:19 PM, Nick Lewycky
> >> > <nlewycky at google.com>
> >> > wrote:
> >> >>
> >> >> On 17 June 2014 06:41, Eli Bendersky <eliben at google.com> wrote:
> >> >>>
> >> >>> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky
> >> >>> <nicholas at mxc.ca> wrote:
> >> >>>>
> >> >>>> Chandler Carruth wrote:
> >> >>>>>
> >> >>>>> This seems fine to me, but I'd like to make sure it looks OK
> >> >>>>> to Nick
> >> >>>>> as
> >> >>>>> well.
> >> >>>>
> >> >>>>
> >> >>>> I strongly prefer baking in knowledge about the intrinsics
> >> >>>> themselves
> >> >>>> into the passes if possible. Metadata will always be
> >> >>>> secondary.
> >> >>>
> >> >>>
> >> >>> So you're saying that in this particular case you'd prefer
> >> >>> LLVM passes
> >> >>> to
> >> >>> know about the range of these PTX intrinsics, rather than
> >> >>> Clang adding
> >> >>> them
> >> >>> as metadata?
> >> >>
> >> >>
> >> >> Yep.
> >> >>
> >> >>> ValueTracking.cpp already has some iffy target knowledge
> >> >>> (someone
> >> >>> sneaked
> >> >>> a direct  Intrinsic::x86_sse42_crc32_64_64 check in there),
> >> >>> but
> >> >>> extending it
> >> >>> to other intrinsics in other targets seems like too much...
> >> >>
> >> >>
> >> >> That's not iffy. That's exactly how it should work, and we
> >> >> should have
> >> >> more of that. There is a major gotcha and that's dealing with
> >> >> the case
> >> >> where
> >> >> the intrinsics don't exist because the backend wasn't compiled
> >> >> in. If
> >> >> x86_sse42_crc32_64_64 is in there (and also in instcombine
> >> >> btw),
> >> >> presumably
> >> >> that problem is solved somehow? Or does llvm actually not build
> >> >> if you
> >> >> don't
> >> >> enable the x86 target? I feel like we would've heard about
> >> >> that.
> >> >>
> >> >> Nick
> >> >>
> >> >>> So should target info be passed into it in some way? Any
> >> >>> suggestions
> >> >>> where to put it? TargetLibraryInfo? TargetTransformInfo? In
> >> >>> any case
> >> >>> this
> >> >>> seems like the target interface will have to be augmented, and
> >> >>> we'll
> >> >>> have to
> >> >>> carry an object around into ValueTracking's compute*
> >> >>> functions. If
> >> >>> this is
> >> >>> the right way, then this is the way it will be done - design
> >> >>> ideas are
> >> >>> appreciated.
> >> >>>
> >> >>> Eli
> >> >>>
> >> >>>
> >> >>>
> >> >>>>
> >> >>>>
> >> >>>> Separately, should value tracking look use range metadata
> >> >>>> when it's
> >> >>>> available? Absolutely.
> >> >>>>
> >> >>>> I think it should apply to all CallInst not just
> >> >>>> IntrinsicInst (which
> >> >>>> is
> >> >>>> derived from CallInst).
> >> >>>>
> >> >>>> Nick
> >> >>>>
> >> >>>>> On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu
> >> >>>>> <jingyue at google.com
> >> >>>>> <mailto:jingyue at google.com>> wrote:
> >> >>>>>
> >> >>>>>     Hi,
> >> >>>>>
> >> >>>>>     The range metadata can only be attached to LoadInst for
> >> >>>>>     now. I
> >> >>>>> am
> >> >>>>>     considering extending its usage to IntrinsicInst so that
> >> >>>>>     the
> >> >>>>>     frontend can annotate the range of the return value of
> >> >>>>>     an
> >> >>>>> intrinsic
> >> >>>>>     call. e.g.,
> >> >>>>>     %a = call i32 @llvm.xxx(), !range !0
> >> >>>>>     !0 = metadata !{ i32 0, i23 1024 }
> >> >>>>>
> >> >>>>>     The motivation behind this extension is some
> >> >>>>>     optimizations we
> >> >>>>> are
> >> >>>>>     working on for CUDA programs. Some special registers in
> >> >>>>>     CUDA
> >> >>>>> (e.g.,
> >> >>>>>     threadIdx.x) are bounded per CUDA programming guide, and
> >> >>>>>     knowing
> >> >>>>>     their ranges can improve the precision of ValueTracking
> >> >>>>>     and
> >> >>>>> benefit
> >> >>>>>     optimizations such as InstCombine.
> >> >>>>>
> >> >>>>>     To implement this idea, we need ValueTracking to be
> >> >>>>>     aware of the
> >> >>>>>     ranges of these special variables. These special
> >> >>>>>     registers are
> >> >>>>> so
> >> >>>>>     far read-only and accessed using intrinsics. e.g.,
> >> >>>>>     %threadIdx.x = call i32
> >> >>>>>     @llvm.nvvm.read.ptx.sreg.tid.x().
> >> >>>>>
> >> >>>>>     One possible approach is to have ValueTracking compute
> >> >>>>>     the known
> >> >>>>>     bits of these intrinsics as special cases. This approach
> >> >>>>>     is
> >> >>>>> already
> >> >>>>>     taken for the x86_sse42_crc32_64_64 intrinsic. However,
> >> >>>>>     this
> >> >>>>>     approach may not be elegant because the ranges of these
> >> >>>>>     CUDA
> >> >>>>> special
> >> >>>>>     registers depend on the GPU compute capability specified
> >> >>>>>     by
> >> >>>>>     -target-cpu. For instance, blockIdx.x is bounded by
> >> >>>>>     65535 in
> >> >>>>> sm_20
> >> >>>>>     but 2^31-1 in sm_30. Exposing -target-cpu to
> >> >>>>>     ValueTracking is
> >> >>>>>     probably discouraged.
> >> >>>>>
> >> >>>>>     Therefore, the approach I am considering is to have
> >> >>>>>     clang
> >> >>>>> annotate
> >> >>>>>     the ranges of these CUDA special registers according to
> >> >>>>>     the
> >> >>>>>     -target-cpu flag, and have ValueTracking pick the range
> >> >>>>>     metadata
> >> >>>>> for
> >> >>>>>     optimization. By doing so, we hide the target-specific
> >> >>>>>     info from
> >> >>>>>     ValueTracking.
> >> >>>>>
> >> >>>>>     The code change in llvm minus clang won't be large. The
> >> >>>>>     core
> >> >>>>> change
> >> >>>>>     is only a few lines:
> >> >>>>>     http://reviews.llvm.org/differential/diff/10464/. If
> >> >>>>>     this
> >> >>>>> extension
> >> >>>>>     sounds good to you, I'll definitely add more tests and
> >> >>>>>     revise
> >> >>>>> the
> >> >>>>>     documents on range metadata.
> >> >>>>>
> >> >>>>>     Best,
> >> >>>>>     Jingyue
> >> >>>>>
> >> >>>>>     _______________________________________________
> >> >>>>>     LLVM Developers mailing list
> >> >>>>>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
> >> >>>>> http://llvm.cs.uiuc.edu
> >> >>>>>
> >> >>>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> _______________________________________________
> >> >>>>> LLVM Developers mailing list
> >> >>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>>
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> LLVM Developers mailing list
> >> >>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>
> >> >>>
> >> >>>
> >> >>> _______________________________________________
> >> >>> LLVM Developers mailing list
> >> >>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> LLVM Developers mailing list
> >> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>
> >> >
> >> >
> >> > _______________________________________________
> >> > LLVM Developers mailing list
> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >
> >
> >
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory