[LLVMdev] Attaching range metadata to IntrinsicInst

Tue Jun 17 14:46:50 PDT 2014

----- Original Message -----
> From: "Nick Lewycky" <nlewycky at google.com>
> To: "Eli Bendersky" <eliben at google.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, June 17, 2014 2:19:57 PM
> Subject: Re: [LLVMdev] Attaching range metadata to IntrinsicInst
> 
> 
> 
> 
> 
> On 17 June 2014 06:41, Eli Bendersky < eliben at google.com > wrote:
> 
> 
> 
> 
> 
> 
> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky < nicholas at mxc.ca >
> wrote:
> 
> 
> 
> Chandler Carruth wrote:
> 
> 
> This seems fine to me, but I'd like to make sure it looks OK to Nick
> as
> well.
> 
> I strongly prefer baking in knowledge about the intrinsics themselves
> into the passes if possible. Metadata will always be secondary.
> 
> 
> 
> So you're saying that in this particular case you'd prefer LLVM
> passes to know about the range of these PTX intrinsics, rather than
> Clang adding them as metadata?
> 
> 
> Yep.
> 
> 
> 
> 
> 
> 
> 
> ValueTracking.cpp already has some iffy target knowledge (someone
> sneaked a direct Intrinsic::x86_sse42_crc32_64_64 check in there),
> but extending it to other intrinsics in other targets seems like too
> much...
> 
> 
> That's not iffy. That's exactly how it should work, and we should
> have more of that. There is a major gotcha and that's dealing with
> the case where the intrinsics don't exist because the backend wasn't
> compiled in. If x86_sse42_crc32_64_64 is in there (and also in
> instcombine btw), presumably that problem is solved somehow? Or does
> llvm actually not build if you don't enable the x86 target? I feel
> like we would've heard about that.

I think that the IR level intrinsics are still defined even if the corresponding backend is not enabled (they're in include/llvm/IR/Intrinsics<TARGET>.td which are all included by include/llvm/IR/Intrinsics.td).

 -Hal

> 
> 
> Nick
> 
> 
> 
> 
> 
> 
> 
> So should target info be passed into it in some way? Any suggestions
> where to put it? TargetLibraryInfo? TargetTransformInfo? In any case
> this seems like the target interface will have to be augmented, and
> we'll have to carry an object around into ValueTracking's compute*
> functions. If this is the right way, then this is the way it will be
> done - design ideas are appreciated.
> 
> 
> 
> Eli
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Separately, should value tracking look use range metadata when it's
> available? Absolutely.
> 
> I think it should apply to all CallInst not just IntrinsicInst (which
> is derived from CallInst).
> 
> Nick
> 
> 
> 
> 
> On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu < jingyue at google.com
> 
> 
> <mailto: jingyue at google.com >> wrote:
> 
> Hi,
> 
> The range metadata can only be attached to LoadInst for now. I am
> considering extending its usage to IntrinsicInst so that the
> frontend can annotate the range of the return value of an intrinsic
> call. e.g.,
> %a = call i32 @llvm.xxx(), !range !0
> !0 = metadata !{ i32 0, i23 1024 }
> 
> The motivation behind this extension is some optimizations we are
> working on for CUDA programs. Some special registers in CUDA (e.g.,
> threadIdx.x) are bounded per CUDA programming guide, and knowing
> their ranges can improve the precision of ValueTracking and benefit
> optimizations such as InstCombine.
> 
> To implement this idea, we need ValueTracking to be aware of the
> ranges of these special variables. These special registers are so
> far read-only and accessed using intrinsics. e.g.,
> %threadIdx.x = call i32 @llvm.nvvm.read.ptx.sreg.tid. x().
> 
> One possible approach is to have ValueTracking compute the known
> bits of these intrinsics as special cases. This approach is already
> taken for the x86_sse42_crc32_64_64 intrinsic. However, this
> approach may not be elegant because the ranges of these CUDA special
> registers depend on the GPU compute capability specified by
> -target-cpu. For instance, blockIdx.x is bounded by 65535 in sm_20
> but 2^31-1 in sm_30. Exposing -target-cpu to ValueTracking is
> probably discouraged.
> 
> Therefore, the approach I am considering is to have clang annotate
> the ranges of these CUDA special registers according to the
> -target-cpu flag, and have ValueTracking pick the range metadata for
> optimization. By doing so, we hide the target-specific info from
> ValueTracking.
> 
> The code change in llvm minus clang won't be large. The core change
> is only a few lines:
> http://reviews.llvm.org/ differential/diff/10464/ . If this extension
> sounds good to you, I'll definitely add more tests and revise the
> documents on range metadata.
> 
> Best,
> Jingyue
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu <mailto: LLVMdev at cs.uiuc.edu >
> http://llvm.cs.uiuc.edu
> 
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> 
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory