[PATCH] D45733: [DAGCombiner] Unfold scalar masked merge if profitable

Wed Apr 18 08:12:56 PDT 2018

lebedev.ri added a comment.

In https://reviews.llvm.org/D45733#1070963, @spatel wrote:

> > If the mask is constant, right now i always unfold it.
>
> Let me make sure I understand. The fold in question is:
>
>   %n0 = xor i4 %x, %y
>   %n1 = and i4 %n0, C1
>   %r  = xor i4 %n1, %y
>   =>
>   %mx = and i4 %x, C1
>   %my = and i4 %y, ~C1
>   %r = or i4 %mx, %my

Yes.
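
For reference, the equivalence of the two forms can be checked exhaustively at the i4 width used in the example above. This is only a minimal sketch, not part of the patch: the function names and the `BITS` constant are made up for illustration, and `C1` is modelled as an arbitrary 4-bit value `c`.

```python
from itertools import product

BITS = 4                      # width of the i4 example (illustrative constant)
MASK = (1 << BITS) - 1        # 0b1111, keeps Python ints within i4 range


def masked_merge_xor(x, y, c):
    """Folded form: ((x ^ y) & c) ^ y."""
    n0 = x ^ y
    n1 = n0 & c
    return (n1 ^ y) & MASK


def masked_merge_unfolded(x, y, c):
    """Unfolded form: (x & c) | (y & ~c)."""
    mx = x & c
    my = y & (~c & MASK)      # ~c is truncated to i4 before use
    return (mx | my) & MASK


def forms_agree():
    """Compare both forms for every (x, y, c) triple of 4-bit values."""
    values = range(1 << BITS)
    return all(
        masked_merge_xor(x, y, c) == masked_merge_unfolded(x, y, c)
        for x, y, c in product(values, repeat=3)
    )
```

Calling `forms_agree()` walks all 4096 input triples, so it only confirms that the two forms compute the same value; it says nothing about which form is cheaper on a given target.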

> If that's correct, we need to take a step back here. If the fold is universally good, then it can go in InstCombine

Yes, that is exactly the question I'm having. I did look at the llvm-mca output.
Here is what llvm-mca reports for `-mtriple=aarch64-unknown-linux-gnu -mcpu=cortex-a75`:
F5971838: diff.txt <https://reviews.llvm.org/F5971838>
Or is this a problem with the scheduling info?

> and there's no need to add code bloat to the DAG to handle the pattern unless something in the backend can create this pattern (seems unlikely).

> But we need to take another step back before we add code bloat to InstCombine. Is there evidence that this pattern exists in source (bug report, test-suite, etc) and affects analysis/performance? If not, is it worth the cost of adding a matcher for the pattern? It's a simple matcher, so the expense bar is low...but if it never happens, do we care?

Repository:
  rL LLVM

https://reviews.llvm.org/D45733