[cfe-dev] Replacing the non-portable diff invocation coded as %diff_plist (and friends)

Hubert Tong via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 2 13:07:59 PDT 2019

Various tests in `clang/test/Analysis/` invoke `diff` with a non-portable
`-I` option. I do not believe that a requirement for a `diff` utility that
supports such an extension is intended by the community. I believe that
replacing `diff -I` with `diff` on files that have been normalized using
`grep -v` is a reasonable solution. A minor detail is that the output being
checked is produced without a newline at the end of the file. As part of
the proposed solution, the file is rehabilitated as a text file (as
required by `grep`) by adding such a newline. The same is done for the
reference expected files that are missing a newline at the end of the file.
The solution is sketched out below for feedback.

The normalization is in many cases encoded in a lit substitution
`%diff_plist`. A sample application of the proposed change would look like
the following. See further below for additional rationale on the

Replace the RUN line:
-// RUN: cat %t.plist | %diff_plist %S/Inputs/expected-plists/unix-fns.c.plist
+// RUN: %normalize_plist <%S/Inputs/expected-plists/unix-fns.c.plist
>%t.expected.sed.plist && echo >>%t.plist && %normalize_plist <%t.plist |
diff -u %t.expected.sed.plist -

Replace the lit substitution:
-# Diff command used by Clang Analyzer tests (when comparing .plist files
+# Filtering command used by Clang Analyzer tests (when comparing .plist
 # with reference output)
-    'diff -u -w -I "<string>/" -I "<string>.:" -I "version"'))
+    "grep -Ev '%s|%s|%s'" %
+        ('^[[:space:]]*<string>.* version .*</string>$',
+         '^[[:space:]]*<string>/.*</string>$',
+         '^[[:space:]]*<string>.:.*</string>$')))

Why not use more `RUN` lines?
The output being checked is line-number sensitive. Keeping the same number
of `RUN` lines minimizes unnecessary noise and risk of errors.

Why not use `sed`?
`sed` can be used to normalize the lines that are expected to vary;
however, the some of the expected-output files have these lines omitted.
The `grep` replacement is more compatible with the behaviour of `diff -I`.
The filtered files are nevertheless named with "sed" for the connotation
that substitution took place.

Why remove `-w` from `diff`?
It appears that the `-w` option was being used for its effect (in some
implementations) of ignoring the difference between a file that ends with a
newline and one which does not. Following normalization of both files to
end in a newline, this is no longer necessary. Since the effect is
non-portable, keeping the option (and thus allowing the non-portable effect
to occur, and perhaps become relied upon during development) is potentially

Why not add the newline via `cat`?
The newline can be added (in this context) using `echo | cat file -` as
opposed to using `echo >>file`; however, the `lit` implementation of `cat`
does not operate as expected on the standard input.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190402/d4979a9b/attachment.html>

More information about the cfe-dev mailing list