[Mlir-commits] [mlir] [mlir][dataflow] Update dataflow tutorial doc and add dataflow example code (PR #149296)

Oleksandr Alex Zinenko llvmlistbot at llvm.org
Thu Aug 28 12:59:07 PDT 2025


================
@@ -5,20 +5,361 @@ daunting and/or complex. A dataflow analysis generally involves propagating
 information about the IR across various different types of control flow
 constructs, of which MLIR has many (Block-based branches, Region-based branches,
 CallGraph, etc), and it isn't always clear how best to go about performing the
-propagation. To help writing these types of analyses in MLIR, this document
-details several utilities that simplify the process and make it a bit more
-approachable.
+propagation. Dataflow analyses often require implementing fixed-point iteration
+when data dependencies form cycles, as can happen with control-flow. Tracking
+dependencies and making sure updates are properly propagated can get quite
+difficult when writing complex analyses. That is why MLIR provides a framework
+for writing general dataflow analyses as well as several utilities to streamline
+the implementation of common analyses. The code and test from this tutorial can 
+be found in `mlir/examples/dataflow`.
+
+## DataFlow Analysis Framework
+
+MLIR provides a general dataflow analysis framework for building fixed-point
+iteration dataflow analyses with ease and utilities for common dataflow
+analyses. Because the landscape of IRs in MLIR can be vast, the framework is
+designed to be extensible and composable, so that utilities can be shared across
+dialects with different semantics as much as possible. The framework also tries
+to make debugging dataflow analyses easy by providing (hopefully) insightful
+logs with `-debug-only="dataflow"`.
+
+Suppose we want to compute at compile-time the constant-valued results of
+operations. For example, consider:
+
+```mlir
+%0 = string.constant "foo"
+%1 = string.constant "bar"
+%2 = string.concat %0, %1
+```
+We can determine with the information in the IR at compile time the value of
+`%2` to be "foobar". This is called constant propagation. In MLIR's dataflow
+analysis framework, this is in general called the "analysis state of a program
+point"; the "state" being, in this case, the constant value, and the "program
+point" being the SSA value `%2`.
+
+The constant value state of an SSA value is implemented as a subclass of
+`AnalysisState`, and program points are represented by the `ProgramPoint` union,
+which can be operations, SSA values, or blocks. They can also be just about
+anything, see [Extending ProgramPoint](#extending-programpoint). In general, an
+analysis state represents information about the IR computed by an analysis. 
+
+Let us define an analysis state to represent a compile time known string value
+of an SSA value:
+
+```c++
+class StringConstant : public AnalysisState {
+  /// This is the known string constant value of an SSA value at compile time
+  /// as determined by a dataflow analysis. To implement the concept of being
+  /// "uninitialized", the potential string value is wrapped in a
+  /// `std::optional` and set to `std::nullopt` by default to indicate that no
+  /// value has been provided.
+  std::optional<std::string> stringValue = std::nullopt;
+
+public:
+  using AnalysisState::AnalysisState;
+
+  /// Return true if no value has been provided for the string constant value.
+  bool isUninitialized() const { return !stringValue.has_value(); }
+
+  /// Default-initialize the state to an empty string. Return whether the value
+  /// of the state has changed.
+  ChangeResult defaultInitialize() {
+    // If the state already has a value, do nothing.
+    if (!isUninitialized())
+      return ChangeResult::NoChange;
+    // Initialize the state and indicate that its value changed.
+    stringValue = "";
+    return ChangeResult::Change;
+  }
+
+  /// Get the currently known string value.
+  StringRef getStringValue() const {
+    assert(!isUninitialized() && "getting the value of an uninitialized state");
+    return stringValue.value();
+  }
+
+  /// "Join" the value of the state with another constant.
+  ChangeResult join(const Twine &value) {
+    // If the current state is uninitialized, just take the value.
+    if (isUninitialized()) {
+      stringValue = value.str();
+      return ChangeResult::Change;
+    }
+    // If the current state is "overdefined", no new information can be taken.
+    if (stringValue->empty())
+      return ChangeResult::NoChange;
----------------
ftynse wrote:

This looks strange. Is the empty string being used as the lattice top (overdefined)? The empty string is a perfectly valid string value, so it cannot double as a sentinel. It is also used as the "default" value in `defaultInitialize` above, which means the lattice is default-initialized to the top state, and since `join` never moves a state away from top, one can never get out of it by joining alone.
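
To make the concern concrete, here is a minimal standalone sketch of one possible shape for such a lattice. It assumes the overdefined state is tracked with an explicit tag rather than a sentinel string. The names `StringLattice`, `Kind`, and `markOverdefined`, and the local `ChangeResult` enum (a stand-in so the snippet compiles without MLIR), are hypothetical and not taken from the PR or from MLIR's API.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Stand-in for mlir::ChangeResult so this sketch compiles without MLIR.
enum class ChangeResult { NoChange, Change };

// Hypothetical three-level lattice:
//   uninitialized (bottom) < known constant < overdefined (top).
class StringLattice {
  enum class Kind { Uninitialized, Constant, Overdefined };
  Kind kind = Kind::Uninitialized;
  std::string value; // Meaningful only when kind == Kind::Constant.

public:
  bool isUninitialized() const { return kind == Kind::Uninitialized; }
  bool isOverdefined() const { return kind == Kind::Overdefined; }

  /// Get the known constant. The empty string is an ordinary value here.
  const std::string &getValue() const {
    assert(kind == Kind::Constant && "no known constant value");
    return value;
  }

  /// Move to the top state explicitly. Return whether the state changed.
  ChangeResult markOverdefined() {
    if (isOverdefined())
      return ChangeResult::NoChange;
    kind = Kind::Overdefined;
    value.clear();
    return ChangeResult::Change;
  }

  /// Join with another constant. Return whether the state changed.
  ChangeResult join(const std::string &other) {
    // Bottom joined with a constant yields that constant.
    if (isUninitialized()) {
      kind = Kind::Constant;
      value = other;
      return ChangeResult::Change;
    }
    // Top absorbs everything; equal constants carry no new information.
    if (isOverdefined() || value == other)
      return ChangeResult::NoChange;
    // Two different constants conflict, so go to top.
    return markOverdefined();
  }
};
```

With this shape, joining `""` into a fresh state yields the constant empty string rather than the top state, and only a conflicting join or an explicit `markOverdefined()` call reaches top.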

https://github.com/llvm/llvm-project/pull/149296
