[PATCH] D29088: Do not create ctlz/cttz(X, false) when the target does not support zero-defined ctlz/cttz.

Tue Jan 24 16:12:32 PST 2017

hfinkel added a comment.

In https://reviews.llvm.org/D29088#655338, @deadalnix wrote:

> @andreadb it hides the branch from many of the subsequent optimization passes, resulting in bad codegen. I have stumbled on bad codegen from cttz/ctlz several times recently. Things that I noticed are: doing the 0 case check several times, failure to constant fold the 0 case when there is one, etc.
>
> It doesn't make sense to reimplement all of this in CodeGenPrepare, nor does it make sense to special-case it in the whole pipeline. Canonicalisation should help subsequent passes, not hide information from them, which is what this is doing.

There is obviously a tradeoff here. Canonicalizing toward the intrinsics provides later passes with more information about what the code is doing. There are a number of transformations and analyses that have special logic for ctlz (etc.). However, when we then expand the intrinsic, we need to make sure that we can optimize away redundancies in those expansions.
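
To make the redundancy concern concrete, here is a rough source-level sketch in C++ rather than actual LLVM IR. All function names are hypothetical, and the loop only stands in for a target instruction whose result is undefined for a zero input. The guarded form corresponds to what a zero-defined ctlz turns into once it is expanded on such a target; the last function shows a caller that has already tested for zero, so the expansion's own test is redundant and something has to clean it up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a target ctlz instruction whose result is
// undefined when the input is zero. The loop is for illustration only.
static unsigned clz_zero_undef(std::uint32_t x) {
    assert(x != 0 && "result is undefined for a zero input");
    unsigned n = 0;
    // Walk down from the top bit until the first set bit is found.
    for (std::uint32_t mask = 0x80000000u; (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

// Roughly what a zero-defined ctlz looks like after expansion on such a
// target: an explicit zero test guarding the zero-undefined operation.
unsigned clz_guarded(std::uint32_t x) {
    return x == 0 ? 32u : clz_zero_undef(x);
}

// A caller that already handles zero itself. Once clz_guarded is expanded
// inline, its x == 0 test is dominated by the test here and can never be
// true, so a later cleanup pass should be able to remove it.
unsigned clz_or_default(std::uint32_t x, unsigned dflt) {
    if (x == 0)
        return dflt;
    return clz_guarded(x);
}
```

Whether the second zero test actually disappears depends on which passes run after the expansion, which is the open question in this thread.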

I really think we need some examples here. Many backends (although perhaps not x86) run EarlyCSE and other IR-level cleanups in CodeGen. Maybe that's a better solution here?

https://reviews.llvm.org/D29088