[PATCH] D103788: [InstCombine] Eliminate casts to optimize ctlz operation

Thu Jun 10 08:18:59 PDT 2021

datta.nagraj added a comment.

In D103788#2810270 <https://reviews.llvm.org/D103788#2810270>, @spatel wrote:

> In D103788#2810167 <https://reviews.llvm.org/D103788#2810167>, @lebedev.ri wrote:
>
>> In D103788#2810074 <https://reviews.llvm.org/D103788#2810074>, @datta.nagraj wrote:
>>
>>> @lebedev.ri Sir, but if we don't take trunc into consideration, then how are we optimizing this?
>>> Earlier it was 2 insts (zext - ctlz), and now its 3 insts(ctlz - add - zext). I agree that the ctlz , add are done on a lower datatype.
>>> If we consider trunc as well, then we can convert 3 insts (zext - ctlz - trunc) to 2 insts (ctlz - add).
>>> Shouldn't we do this opt only when the trunc is present?
>>
>> In general - yes, in instcombine we should not increase the instruction count,
>> however as it was already hinted by @spatel, this seems like a rare edge case
>> where we should be okay with that. (unless @spatel disagrees?)
>
> This could go either way, but I'd prefer that we have this more conventional (don't increase instructions) transform as a first step. If there's evidence that we benefit from the more general (no trunc required) transform, then we can always follow-up.
> Definitely agree that we need to check for matching source and destination types to make this patch sound (and add a negative test to verify that).

Added a check to see that the source and destination types match. Added the negative test as well to verify that.
Also added a check to do this opt only for width more than 5 as @lebedev.ri Sir, has pointed out that this won't work for bitwidth below 6 and would require an additional zext.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103788/new/

https://reviews.llvm.org/D103788