<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Making memmove unconditional and reducing bb count"
   href="https://bugs.llvm.org/show_bug.cgi?id=43836">43836</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Making memmove unconditional and reducing bb count
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Loop Optimizer
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>lebedev.ri@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
<pre>I'm not sure these samples fully reproduce the issue; if they do not, that is
unintentional.

It is not unheard of to apply a running summation (predictor) to each row
of an image; I have seen it widely in image decoding.
It is a bit awkward, though: the first row needs some default initial value,
and each subsequent row may need to copy its initial value
from the beginning of the previous row.
Written naively, this results in branching inside the loop.
But it can also be written with no extra branching:

#include <algorithm>

static constexpr int pred_size = 4;

void sink(int* data, int* pred);

void bad(int* data, int* pred, int len, int width) {
    for(int i = 0; i < len; i++) {
        int *row = data + width*i;
        if(i != 0) {
            std::copy_n(data + width*(i-1),
                        pred_size,
                        pred);
        }
        sink(row, pred);
    }
}
void subpar(int* data, int* pred, int len, int width) {
    for(int i = 0; i < len; i++) {
        int *row = data + width*i;
        std::copy_n(i == 0 ? pred : (data + width*(i-1)),
                    pred_size,
                    pred);
        sink(row, pred);
    }
}
void good(int* data, int* pred, int len, int width) {
    int* predNext = pred;
    for(int i = 0; i < len; i++) {
        int *row = data + width*i;
        // If i == 0, this is a no-op copy: we memmove [pred, pred+pred_size)
        // over itself.
        // Otherwise, we copy [data + width*(i-1), data + width*(i-1) + pred_size)
        // from the previous row into [pred, pred+pred_size).
        std::copy_n(predNext,
                    pred_size,
                    pred);
        predNext = row;
        sink(row, pred);
    }
}

<a href="https://godbolt.org/z/julQt_">https://godbolt.org/z/julQt_</a> 
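
To sanity-check that the three variants really are interchangeable, something
like the following harness could be used. This is only a sketch: the `sink()`
body, the `trace` log, `run()` and `variants_agree()` are hypothetical
stand-ins that are not part of the samples above (there, `sink()` is opaque),
and it assumes width >= pred_size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

static constexpr int pred_size = 4;

// Hypothetical log of every predictor value that sink() observes.
static std::vector<int> trace;

// Hypothetical stand-in for the opaque sink(): record the predictor, then
// fold it into the start of the row so that later rows depend on earlier ones.
void sink(int* data, int* pred) {
    trace.insert(trace.end(), pred, pred + pred_size);
    for (int k = 0; k < pred_size; k++)
        data[k] += pred[k];
}

// The three variants, as in the samples above.
void bad(int* data, int* pred, int len, int width) {
    for (int i = 0; i < len; i++) {
        int* row = data + width * i;
        if (i != 0)
            std::copy_n(data + width * (i - 1), pred_size, pred);
        sink(row, pred);
    }
}

void subpar(int* data, int* pred, int len, int width) {
    for (int i = 0; i < len; i++) {
        int* row = data + width * i;
        std::copy_n(i == 0 ? pred : (data + width * (i - 1)), pred_size, pred);
        sink(row, pred);
    }
}

void good(int* data, int* pred, int len, int width) {
    int* predNext = pred;
    for (int i = 0; i < len; i++) {
        int* row = data + width * i;
        std::copy_n(predNext, pred_size, pred);
        predNext = row;
        sink(row, pred);
    }
}

using Fn = void (*)(int*, int*, int, int);

// Run one variant on freshly initialized inputs and return what sink() saw.
static std::vector<int> run(Fn fn, int len, int width) {
    assert(width >= pred_size);
    std::vector<int> data(static_cast<std::size_t>(len) * width);
    for (std::size_t k = 0; k < data.size(); k++)
        data[k] = static_cast<int>(k) + 1;
    int pred[pred_size] = {-1, -2, -3, -4};
    trace.clear();
    fn(data.data(), pred, len, width);
    return trace;
}

// True if all three variants pass identical predictors to sink().
bool variants_agree(int len, int width) {
    const std::vector<int> expected = run(bad, len, width);
    return run(subpar, len, width) == expected &&
           run(good, len, width) == expected;
}
```

Calling `variants_agree(len, width)` for a few sizes only shows that the
rewrite is behavior-preserving for this particular `sink()`; it says nothing
about whether the optimizer can prove it in general.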


I'm wondering how this optimization could be approached.
I guess there are several steps here:
1. Turn `bad()` into `subpar()` by making the memory copy unconditional
(selecting between the sources)
2. Replace the select with a phi
3. #43835 ?</pre>
        </div>
      </p>


    </body>
</html>