[llvm] update_test_checks: keep meta variables stable by default (PR #76748)
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 4 11:12:14 PST 2024
================
@@ -1176,20 +1215,236 @@ def may_clash_with_default_check_prefix_name(check_prefix, var):
)
+VARIABLE_TAG = "[[@@]]"
+METAVAR_RE = re.compile(r"\[\[([A-Z0-9_]+)(?::[^]]+)?\]\]")
+NUMERIC_SUFFIX_RE = re.compile(r"[0-9]*$")
+
+
+class CheckValueInfo:
+    def __init__(
+        self,
+        nameless_value: NamelessValue,
+        var: str,
+        prefix: str,
+    ):
+        self.nameless_value = nameless_value
+        self.var = var
+        self.prefix = prefix
+
+
+class CheckLineInfo:
+    def __init__(self, line, values):
+        self.line: str = line
+        self.values: List[CheckValueInfo] = values
+
+    def __repr__(self):
+        return f"CheckLineInfo(line={self.line}, self.values={self.values})"
+
+
+def remap_metavar_names(
+    orig_line_infos: List[CheckLineInfo],
+    new_line_infos: List[CheckLineInfo],
+    committed_names: Set[str],
+) -> Mapping[str, str]:
+    """
+    Map all FileCheck variable names that appear in new_line_infos to new
+    FileCheck variable names in an attempt to reduce the diff from orig_line_infos
+    to new_line_infos.
+    """
+    # Initialize uncommitted identity mappings
+    new_mapping = {}
+    for line in new_line_infos:
+        for value in line.values:
+            new_mapping[value.var] = value.var
+
+    # Recursively commit to the identity mapping or find a better one
+    def recurse(
+        orig_line_infos: List[CheckLineInfo], new_line_infos: List[CheckLineInfo]
+    ):
+        if not new_line_infos or not orig_line_infos:
+            return
+
+        lines = set()
+
+        # Search for lines that are identical on both sides, including meta
+        # variable names, and commit to those names immediately
+        for line in orig_line_infos:
+            key = (line.line.strip(), tuple(value.var for value in line.values))
+            lines.add(key)
+
+        for line in new_line_infos:
+            key = (
+                line.line.strip(),
+                tuple(new_mapping[value.var] for value in line.values),
+            )
+            if key in lines:
+                for value in line.values:
+                    committed_names.add(new_mapping[value.var])
+
+        # Search for lines that are unique on both sides if we only consider
+        # variable names that have been committed.
----------------
nhaehnle wrote:
The point of this is to bound the runtime complexity: it keeps the number of edges in the matching graph linear, and so prevents the overall running time from becoming quadratic in the worst case.
This is apparently a fairly common trick in diff implementations.
It's not a significant simplification of the algorithm. The only gotcha is that if we allow non-unique matches, we need to be careful about how we sort the matching edges: they have to be matched in *increasing* old line number first, and then in *decreasing* new line number. This isn't hard to see with the line sweeping intuition and the standard trick of applying an epsilon perturbation.
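
To illustrate the sorting rule, here is a minimal sketch of how a non-crossing matching could be picked from a set of candidate (old line, new line) edges when non-unique matches are allowed. It is not the code from the PR: the helper name `max_noncrossing_matching` and the choice of a longest-strictly-increasing-subsequence search are assumptions made for illustration only.

```python
from bisect import bisect_left
from typing import List, Tuple


def max_noncrossing_matching(edges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Hypothetical helper: pick a largest subset of (old_line, new_line)
    edges that is strictly increasing in both coordinates."""
    # Increasing old line first, then decreasing new line. The descending
    # tie-break means two edges that share an old line can never both be
    # part of a strictly increasing chain of new line numbers.
    ordered = sorted(edges, key=lambda e: (e[0], -e[1]))

    tails: List[int] = []     # tails[k]: smallest new line ending a chain of length k + 1
    tail_idx: List[int] = []  # index into `ordered` of the edge that ends that chain
    prev: List[int] = [-1] * len(ordered)

    for i, (_, new) in enumerate(ordered):
        k = bisect_left(tails, new)  # bisect_left enforces a strict increase
        if k == len(tails):
            tails.append(new)
            tail_idx.append(i)
        else:
            tails[k] = new
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk back from the last edge of the longest chain.
    result: List[Tuple[int, int]] = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        result.append(ordered[i])
        i = prev[i]
    result.reverse()
    return result


if __name__ == "__main__":
    # Old line 1 could match new lines 2 or 3; old line 2 could match 1 or 3.
    print(max_noncrossing_matching([(1, 2), (1, 3), (2, 1), (2, 3), (3, 4)]))
```

If ties were instead broken by *increasing* new line number, both `(1, 2)` and `(1, 3)` could end up in the same chain, which would match old line 1 twice. When every line is unique on both sides, each line has at most one candidate edge, so the edge list stays linear in the number of lines and the tie-break never comes into play.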
https://github.com/llvm/llvm-project/pull/76748