[PATCH] D20774: [InstCombine] look through bitcasts to find selects

Sat May 28 15:05:52 PDT 2016

spatel added inline comments.

================
Comment at: lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:1335
@@ +1334,3 @@
+
+  // Canonicalize SExt, Not, or BitCast to the LHS.
+  if (match(Op1, m_SExt(m_Value())) || match(Op1, m_Not(m_Value())) ||
----------------
eli.friedman wrote:
> spatel wrote:
> > eli.friedman wrote:
> > > This canonicalization seems incomplete... for example, if both operands are sext instructions, the one which belongs on the LHS is the one where the source is a bool, but this just picks randomly.
> > Yes, you're right. I was going to say it is just an existing bug for the sext/not cases, but it seems we don't optimize the bitcast case either:
> > 
> >   define <2 x i64> @vecBitcastOp0(<4 x i32> %a, <8 x i16> %b) {
> >     %bc1 = bitcast <4 x i32> %a to <2 x i64>
> >     %bc2 = bitcast <8 x i16> %b to <2 x i64>
> >     %and = and <2 x i64> %bc1, %bc2
> >     ret <2 x i64> %and
> >   }
> > 
> > I thought we'd eliminate one bitcast (randomly?) for this case and do the logic op in the format of one of the inputs. Is that a valid IR optimization?
> > 
> Well, you can't transform the return type of a function unconditionally; it could change the calling convention.
> 
> But to actually answer your question, you can perform an "and" in any type of the right size as long as there aren't any padding bits involved; "and" is a bit-wise operation. Whether that's a good idea probably depends on the target; performing an "and" in an illegal type is likely to be expensive.
Sorry - I didn't finish that example properly here. We'd of course need a bitcast after the "promoted" logic op to get the types to match. Please let me know what you think of this example:
https://llvm.org/bugs/show_bug.cgi?id=27925

================
Comment at: lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:1359
@@ +1358,3 @@
+  // so the types all line up. We're assuming that bitcasts are free and, as
+  // above, a select is cheaper than the combination of sext + and.
+  //
----------------
eli.friedman wrote:
> spatel wrote:
> > eli.friedman wrote:
> > > Assuming bitcasts are free is kind of a big assumption... you could easily end up in situations where the bitcasts are not free.  If X is scalar, the extra bitcasts should be folded away, but you could end up with bad effects on code like "((uint64_t)(vec1 == vec2)) & x".
> > Would it be fair to assume that vector bitcasts to another vector type are free? If not, then I suppose I need to abandon this patch and try again in the DAG?
> Vector to vector bitcasts are free on most platforms, but not all.  As a practical example, big-endian AArch64 has non-trivial vector bitcasts: a bitcast translates to a vector shuffle to put the elements in the right places.
> 
> In theory, there's infrastructure for querying costs like this from transform passes; see TargetTransformInfo::getCastInstrCost.  Not sure how that would work out in practice.
Based on my experience so far, I assume we'd just delay this kind of combining to the DAG rather than attempt it in IR in a target-dependent way. It sounds like that's what needs to happen in this case: the problem can't be generalized to all targets, so I'll make an x86 solution to start and other targets can pick it up if they like.

I'll let this sit a bit in case anyone else wants to comment, but for now I'm assuming I will eventually abandon this patch. Thank you for the feedback, Eli!

http://reviews.llvm.org/D20774