<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/56830>56830</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [mlir] DCE optimistically assuming function not part of DataFlowAnalysis is unreachable
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            mlir:core
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          zero9178
      </td>
    </tr>
</table>

<pre>
    Reproducer that can be run via `mlir-opt %s --pass-pipeline="func.func(sccp)"`:
```mlir
func.func private @foo() -> index {
  %0 = arith.constant 10 : index
  return %0 : index
}

func.func private @bar(%arg0: index) -> index {
  %c0 = arith.constant 0 : index
  %1 = arith.constant 420 : index
  %7 = arith.cmpi eq, %arg0, %c0 : index
  cf.cond_br %7, ^bb1(%1 : index), ^bb2

^bb1(%8: index):  // 2 preds: ^bb0, ^bb2
  return %8 : index

^bb2:
  %13 = call @foo() : () -> index
  cf.br ^bb1(%13 : index)
}
```
This incorrectly folds `return %8 : index` to `return %c420 : index`, even though `^bb2` branches to `^bb1` with a different value as the block argument.

The underlying issue is that:
https://github.com/llvm/llvm-project/blob/ab701975e7f3b63bb474afbdeb8c474950d41074/mlir/lib/Analysis/DataFlow/SparseAnalysis.cpp#L109-L117
queries the predecessor state of a call, but DCE never creates or writes to that state. Since PredecessorState optimistically assumes all predecessors are known, while its set of known predecessors is still empty, the code joins nothing and simply returns at line 117.
I can only guess, but it seems to me the intention was that this predecessor state should be set in either:
https://github.com/llvm/llvm-project/blob/ab701975e7f3b63bb474afbdeb8c474950d41074/mlir/lib/Analysis/DataFlow/DeadCodeAnalysis.cpp#L410
or
https://github.com/llvm/llvm-project/blob/ab701975e7f3b63bb474afbdeb8c474950d41074/mlir/lib/Analysis/DataFlow/DeadCodeAnalysis.cpp#L300
or maybe during priming of the analysis.
The former does not work because the analysis is scheduled on func.func operations, so it never sees the func.return of @foo.
The latter only sets the call as predecessor to the callable.
I think what is needed is to mark the predecessors of the call as unknown if the callable is not part of the analysis.
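
To make the failure mode concrete, here is a minimal, self-contained C++ sketch. It is a toy model only: `ToyLattice`, `ToyPredecessorState`, `visitCallResult` and `blockArgument` are hypothetical names, not the MLIR classes, and the logic only mirrors the shape of the linked code as described above. It assumes a three-level constant lattice where "uninitialized" is the identity of the join.
```cpp
#include <cassert>
#include <vector>

// Toy constant-propagation lattice (hypothetical, not mlir::dataflow).
struct ToyLattice {
  enum Kind { Uninitialized, Constant, Overdefined };
  Kind kind = Uninitialized;
  long value = 0;

  static ToyLattice constant(long v) {
    ToyLattice l;
    l.kind = Constant;
    l.value = v;
    return l;
  }
  static ToyLattice overdefined() {
    ToyLattice l;
    l.kind = Overdefined;
    return l;
  }

  // Uninitialized is the identity; two different constants become Overdefined.
  void join(const ToyLattice &rhs) {
    if (rhs.kind == Uninitialized || kind == Overdefined)
      return;
    if (kind == Uninitialized) {
      *this = rhs;
      return;
    }
    if (rhs.kind == Overdefined || rhs.value != value)
      *this = overdefined();
  }
};

// Toy predecessor state of a call: optimistic by default, no return sites yet.
struct ToyPredecessorState {
  bool allKnown = true;
  std::vector<ToyLattice> returnedValues; // one entry per known return site
};

// Shape of the call-result visit: give up if predecessors are unknown,
// otherwise join the values of all known return sites.
ToyLattice visitCallResult(const ToyPredecessorState &preds) {
  if (!preds.allKnown)
    return ToyLattice::overdefined();
  ToyLattice result; // stays Uninitialized if there are no known return sites
  for (const ToyLattice &v : preds.returnedValues)
    result.join(v);
  return result;
}

// Block argument of ^bb1: join of the operand from ^bb0 and the one from ^bb2.
ToyLattice blockArgument(const ToyLattice &fromEntry,
                         const ToyLattice &fromCall) {
  ToyLattice arg = fromEntry;
  arg.join(fromCall);
  return arg;
}
```
With a default-constructed `ToyPredecessorState` (nobody ever wrote to it), the call result stays uninitialized and `blockArgument(ToyLattice::constant(420), ...)` remains the constant 420, which matches the bad fold. Setting `allKnown = false` for a callable outside the analysis makes the call result overdefined, so the block argument is no longer treated as a constant.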
</pre>