[PATCH] D111500: [InstSimplify] Simplify intrinsic comparisons with domain knowledge

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 11 13:57:21 PDT 2021


tra added inline comments.


================
Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:609
+  // fold %cmp = icmp slt i32 %tid, %ntid to true.
+  if (Inst0->getIntrinsicID() == Intrinsic::nvvm_read_ptx_sreg_tid_x &&
+      Inst1->getIntrinsicID() == Intrinsic::nvvm_read_ptx_sreg_ntid_x)
----------------
jhuber6 wrote:
> tra wrote:
> > nikic wrote:
> > > jhuber6 wrote:
> > > > tra wrote:
> > > > > What if LLVM has been compiled without the NVPTX back-end? I'm not sure that NVVM intrinsics will be available then.
> > > > > 
> > > > > Perhaps we should revisit enabling the NVVMIntrRange.cpp pass. This should make it possible for LLVM to figure out this optimization, and more.
> > > > Ranges would only give us an upper bound, right? Maybe we could insert `llvm.assume` calls as well as the ranges there, then.
> > > > 
> > > > 
> > > > I think intrinsic functions are available, but I haven't checked. We use them in OpenMPOpt, which is in the default pipeline, and I haven't heard any complaints, so it's probably fine.
> > > I believe intrinsics are always included, even if the target is disabled. But I also don't think we have precedent for target intrinsic handling in InstSimplify, so adding @spatel and @lebedev.ri for that. Though I don't really see a problem with it.
> > > 
> > > We do provide InstCombine hooks (instCombineIntrinsic in TTI), but those work directly on the intrinsic. You could use that to replace NVVMIntrRange I believe. Though I don't think that would cover the particular use-case here, because range metadata is not sufficient to derive this result.
> > > Ranges would only give us an upper bound, right?
> > 
> > Yes, they do not provide any info about the relationship between the launch grid parameters.
> > 
> > > Maybe we could insert llvm.assume calls as well as the ranges there, then.
> > 
> > Something like that.
> > 
> I think using `llvm.assume` would be a good solution in general if we can get it to work; it might make all of these cases automatic. Do we want to go down that avenue, or just stick with this as the more straightforward option?
> 
> Is there a reason the NVVMIntrRange.cpp isn't currently enabled? Seems straightforward enough.
> Is there a reason the NVVMIntrRange.cpp isn't currently enabled? Seems straightforward enough.

It triggered odd regressions in TensorFlow code that I was unable to find the root cause for.
Since the pass provided only minor benefits, I just left it disabled by default. I'll try re-testing with the pass enabled and see how it fares now.
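For concreteness, here is a rough sketch (not part of this patch) of what the `llvm.assume` idea discussed above could look like if an early NVPTX pass such as NVVMIntrRange emitted the assumption itself; the helper name and insertion point below are hypothetical. It encodes the launch-grid invariant tid.x < ntid.x so that generic passes could pick it up without target-specific code in InstSimplify:

  // Rough sketch, assuming it runs once on each kernel entry function.
  // The helper name is made up for illustration.
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/IntrinsicsNVPTX.h"

  using namespace llvm;

  static void emitTidLtNtidAssumption(Function &F) {
    Module *M = F.getParent();
    // Insert at the top of the entry block so the fact dominates all uses.
    IRBuilder<> B(&*F.getEntryBlock().getFirstInsertionPt());

    Function *TidFn =
        Intrinsic::getDeclaration(M, Intrinsic::nvvm_read_ptx_sreg_tid_x);
    Function *NtidFn =
        Intrinsic::getDeclaration(M, Intrinsic::nvvm_read_ptx_sreg_ntid_x);

    Value *Tid = B.CreateCall(TidFn);
    Value *Ntid = B.CreateCall(NtidFn);
    // For any valid launch, tid.x is strictly less than ntid.x.
    Value *Cond = B.CreateICmpULT(Tid, Ntid, "tid.lt.ntid");
    B.CreateAssumption(Cond);
  }

Whether the existing assumption machinery would actually fold the `icmp slt` from the example above without further help is a separate question, and the extra intrinsic calls plus the assume are IR that later passes would have to clean up, so this is only meant to illustrate the shape of that alternative.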


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111500/new/

https://reviews.llvm.org/D111500


