[PATCH] D52146: [InstCombine] foldICmpWithLowBitMaskedVal(): handle ~(-1 << y) mask

Tue Sep 18 07:47:12 PDT 2018

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

In https://reviews.llvm.org/D52146#1237294, @lebedev.ri wrote:

> In https://reviews.llvm.org/D52146#1237101, @spatel wrote:
>
> > In https://reviews.llvm.org/D52146#1236849, @lebedev.ri wrote:
> >
> > > In https://reviews.llvm.org/D52146#1236832, @spatel wrote:
> > >
> > > > Is this a generalization of https://reviews.llvm.org/D52001 (and if so, can we remove that code as a special-case of the more general pattern)?
> > >
> > >
> > > No, i don't think so, why?
> > >
> > > Like i wrote in the description:
> > >
> > > > Two folds are happening here:
> > > > 
> > > > 1. https://rise4fun.com/Alive/oaFX
> > > > 2. And then `foldICmpWithHighBitMask()` (https://reviews.llvm.org/D52001) fires on newly formed IR: https://rise4fun.com/Alive/wsP4
> > >
> > > I.e. this fold actually allows https://reviews.llvm.org/D52001 to happen.
> >
> >
> > Sorry, I misread what was happening in this patch series.
>
>
> Great. I was worrying i was missing something obvious here.
>
> In https://reviews.llvm.org/D52146#1237101, @spatel wrote:
>
> > Things look logically correct,
>
>
>
>
> > but we should clarify for the record - what is the higher-level motivation for these folds?
> >  Particularly, for the cases with non-canonical pattern matching - do we have some important application
> >  that will benefit or data for how often those patterns occur across a range of benchmarks?
>
> The main reason is consistency.
>  I have seen all 4 of these variants to produce the mask in the wild.
>  I'm not aware of some 5'th variant. (at least right now?)
>
> Sure, we may just forget these non-canonical patterns exist for the matter of this fold,
>  like we do in most other cases, but if we fail to canonicalize them, well, we fail.
>
> I certainly understand the point about 'death by thousand cuts', but well,
>  if two more trivial patters notably matter, i suspect there is some larger problem..
>  (It would/will be oh so much simpler if each and every patter needn't be to manually found/tested/folded..)

Agreed - not sure how much work that automated/better solution requires, but I guess we'll eventually get there. But in the meantime, we should be aware of the compile-time implications of the current instcombine. Therefore, not adding folds without some real-world motivation. The perf of visitICmpInst is probably 1 of the most concerning for those that are watching the compile-time creep up, so that's why I'm raising it here.

> Stats are not as impressive. LLVM stage 2
> 
>   $ # (when the mask is constant, the actual number is higher since it fires more than once in some files)
>   $ find -iname *.stats | xargs grep D52146_CONSTANT | wc -l
>   157
>   $ # the canonical pattern variants.
>   $ find -iname *.stats | xargs grep D52146_CANONICAL
>   ./llvm-stage2/tools/lld/lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.stats:        "instcombine.D52146_CANONICAL": 1,
>   ./llvm-stage2/tools/lld/wasm/InputChunks.stats: "instcombine.D52146_CANONICAL": 2,
>   $ find -iname *.stats | xargs grep D52146_UNCANONICAL
>   $ # well yeah, zero. given how few D52146_CANONICAL there are, i suspect this is not representable.

Thanks for collecting the data. So the non-canonical cases are closer to the edge, but this patch is definitely justified. LGTM.

Repository:
  rL LLVM

https://reviews.llvm.org/D52146