[llvm] [AMDGPU] Omit umin on ctlz/cttz if operand is non-zero. (PR #79127)

Tue May 14 08:58:43 PDT 2024

PeddleSpam wrote:

> > > > Instead of doing this during the lowering, should the combine on CTLZ/CTTZ transform the non-undef version into the undef version if the input is known non-zero? I thought it was already doing that (it is https://github.com/llvm/llvm-project/blob/9731b77e80261c627d79980f8c275700bdaf6591/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L11005C6-L11005C7)
> > > 
> > > 
> > > It must be missing some cases. Otherwise the tests wouldn't change.
> > 
> > 
> > Yes, so should debug why that happened. We shouldn't need to reinvent optimizations during the lowering
> 
> `DAGCombiner` doesn't catch this because we've already lowered `CTLZ` to `FFBH` before it runs.

Correction: it does reach `DAGCombiner`, but the `isKnownNeverZero` check fails because the analysis hits its recursion depth limit.
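
To illustrate the failure mode, here is a minimal standalone sketch of a depth-limited "known never zero" query. It is a toy model, not LLVM code: `Node`, `Kind`, `isKnownNeverZeroToy`, `wrapNonZeroConstant` and the cutoff `kMaxDepth` are all hypothetical names and values chosen for this example. The point is only that such a query has to answer "not known" once it runs out of depth, even when the value is provably non-zero a few nodes further down.

```cpp
#include <cassert>
#include <memory>

// Toy stand-in for a DAG node. None of these names come from LLVM.
enum class Kind { Constant, ZeroExtend, Unknown };

struct Node {
  Kind kind;
  unsigned value;                 // only meaningful for Kind::Constant
  std::shared_ptr<Node> operand;  // only meaningful for Kind::ZeroExtend
};

// Assumed cutoff for this sketch; the real analysis has its own limit.
constexpr unsigned kMaxDepth = 6;

// Returns true only if the value is proven non-zero within kMaxDepth steps.
// A false result means "could not prove it", not "the value may be zero".
bool isKnownNeverZeroToy(const std::shared_ptr<Node>& n, unsigned depth = 0) {
  assert(n && "node must not be null");
  if (depth >= kMaxDepth)
    return false;  // out of depth: give up conservatively
  switch (n->kind) {
    case Kind::Constant:
      return n->value != 0;
    case Kind::ZeroExtend:
      // Zero-extension preserves non-zero-ness, so look through it.
      return isKnownNeverZeroToy(n->operand, depth + 1);
    case Kind::Unknown:
      return false;
  }
  return false;
}

// Builds the constant 1 wrapped in `levels` zero-extend nodes.
std::shared_ptr<Node> wrapNonZeroConstant(unsigned levels) {
  auto n = std::make_shared<Node>(Node{Kind::Constant, 1, nullptr});
  for (unsigned i = 0; i < levels; ++i)
    n = std::make_shared<Node>(Node{Kind::ZeroExtend, 0, n});
  return n;
}
```

In this sketch, `isKnownNeverZeroToy(wrapNonZeroConstant(3))` succeeds, while the same query on `wrapNonZeroConstant(10)` fails even though the value is still the constant 1. That is the shape of the problem here: the combine is reached, but the proof is cut off before it gets to the node that establishes non-zero-ness.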

https://github.com/llvm/llvm-project/pull/79127