[PATCH] D80694: Improve lit.py's usability by highlighting failing lines.

Julian Lettner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 29 11:27:11 PDT 2020


yln added a comment.

> Is there no running CI?

There are bots (managed by different people), but lit itself is usually only tested as part of `check-all`.  We want to avoid breaking too many people; since this feature interacts with the external shell, I can see it causing problems on different platforms/environments.

> Patch 1: Change behavior of `-v`. (and default and -vv behavior)
>  Patch 2: Add highlighting/colored output.
>  Patch 3: Updating https://llvm.org/docs/CommandGuide/lit.html
>  Patch 4: Rename `echo_all_commands` to `runWithCommandOutput`.

That sounds like a good plan.

> Also, when you say "patch", do you mean different change requests (with their own reviews), or different commits but in this current change request (all in the same review)?

Yes, different "revisions" with their own review each.



================
Comment at: llvm/utils/lit/lit/TestRunner.py:1637-1639
+    output = createScriptOutput(litConfig, script,
+                                out if out else err,
+                                status, exitCode)
----------------
varungandhi-apple wrote:
> yln wrote:
> > varungandhi-apple wrote:
> > > The `out if out else err` might seem a bit strange (it was tripping me up for a while). For reasons I haven't dug into (perhaps it has to do with internal vs. external shell), it looks like the applicable command output sometimes comes from stdout and sometimes from stderr. You can look at the `shtest-run-at-line.py` test case to see this in action. So I ended up going with a best-effort "try this or that". Maybe we can make this more precise.
> > > 
> > > [Now I'm wondering if this ^ should be a TODO in the code instead of a review comment...]
> > The code below makes me think that we should try to find the RUN line (`locateLastFailingRunLine(cmdOutput)`) in both stdout and stderr.
> We still need to break the tie somehow though; what happens if both the stdout and stderr have `RUN: at line` in them? Is your suggestion that we try finding in one, if that fails, then try finding it in the other?
That sounds reasonable.  How do you propose to break the tie? I would probably go stderr -> stdout, since errors are more likely to appear on stderr.
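The stderr-then-stdout fallback discussed above could look roughly like this. Note that `locateLastFailingRunLine` here is a simplified stand-in for the helper in the patch (a bare regex over the `RUN: at line N` markers lit echoes), not its actual implementation:

```python
import re

def locateLastFailingRunLine(cmdOutput):
    """Simplified stand-in: number of the last 'RUN: at line N' marker, or None."""
    matches = re.findall(r"RUN: at line (\d+)", cmdOutput)
    return int(matches[-1]) if matches else None

def locateFailingRunLine(out, err):
    # Prefer stderr over stdout: errors are more likely to land there.
    # Fall back to stdout only if stderr has no marker at all.
    for stream in (err, out):
        lineNumber = locateLastFailingRunLine(stream)
        if lineNumber is not None:
            return lineNumber
    return None
```

With this ordering, a marker on stderr always wins, and stdout is only consulted when stderr is silent.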


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1480
 
+def locateLastFailingRunLine(outputStr):
+    """
----------------
varungandhi-apple wrote:
> yln wrote:
> > Phew, this function is pretty complicated.  Please make sure the major cases are covered by tests.
> Do you mean adding unit tests for testing this function by itself (I don't recall seeing any unit tests in the code)? Or do you mean integration tests?
I just meant: we have 4 different returns and should make sure that all code paths are tested.
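As a rough illustration of "one test per return": the stand-in below has three outcomes (the helper in the patch has four) and one assertion pinning down each path. The function and marker format are assumptions based on this review, not lit's real code:

```python
import re

def lastRunLine(output):
    """Stand-in with three code paths: no marker, one marker, several markers."""
    matches = re.findall(r"^RUN: at line (\d+)$", output, re.MULTILINE)
    if not matches:
        return None          # path 1: no RUN marker at all
    return int(matches[-1])  # path 2/3: single marker, or last of many

# One assertion per code path.
assert lastRunLine("error: no markers here\n") is None
assert lastRunLine("RUN: at line 4\nerror: boom\n") == 4
assert lastRunLine("RUN: at line 4\nRUN: at line 9\nerror: boom\n") == 9
```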


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1496
+    # However, this is not perfect either :(. For example, if an external shell
+    # prepends line numbers on 'set +x', then this will fail.
+
----------------
varungandhi-apple wrote:
> yln wrote:
> > Can we add a test + comment for the failing case so the behavior is documented.
> I will add the test. Could you elaborate on the comment part; I don't follow. What additional information should that comment describe?
I meant: add the test to "pin down" the current behavior (even if it's not the desired optimum) and put this explanatory comment next to it.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1564
+               == OutputSettings.FullScriptWithHighlighting)
+        return highlightedOutput()
+
----------------
varungandhi-apple wrote:
> yln wrote:
> > Why is there a separate case for highlighted script output?  For command output, we highlight whenever it's supported.  I think we should do the same for script output.
> I don't understand the question. Could you rephrase? If there's no failure, there's nothing to be highlighted. Similarly, if someone is using `-a`, we are not doing any highlighting (i.e. we get `FullScript`). We only highlight when we actually have a failure and we were asked to highlight the output.
I think we should always highlight the failure if there is one (just as we do with command output).


================
Comment at: llvm/utils/lit/lit/cl_arguments.py:171
 
-    # Validate command line options
-    if opts.echoAllCommands:
-        opts.showOutput = True
+    # default output settings
+    opts.showOutputOnFailure = True
----------------
varungandhi-apple wrote:
> yln wrote:
> > I think it should be possible to remove most of this code by setting the defaults/destination in `add_argument()`.
> Do we expect the order of verbosity arguments to affect what the behavior is? The current code is saying that it should not affect the behavior. If you specify `-a` anywhere, you get all the output. We do have tests doing `-a -v` which seem to be expecting all the output, so that would need to be changed as well.
I think we should try to preserve existing behavior, if the new/other behavior isn't a clear improvement.  Here, I don't think the order should matter.

I just meant that this block of code could probably be simpler if we use the appropriate argparse features.
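One hedged sketch of how argparse could keep the order-independent "highest verbosity wins" behavior without a manual validation block: a custom action that takes the maximum level seen. The option names mirror lit's `-v`/`-vv`/`-a`, but the `verbosity` dest and numeric levels are illustrative, not lit's actual options code:

```python
import argparse

class MaxLevelAction(argparse.Action):
    """Keep the highest verbosity level seen, so flag order is irrelevant."""
    def __call__(self, parser, namespace, values, option_string=None):
        current = getattr(namespace, self.dest, 0) or 0
        setattr(namespace, self.dest, max(current, self.const))

parser = argparse.ArgumentParser()
parser.add_argument("-v", dest="verbosity", action=MaxLevelAction,
                    nargs=0, const=1, default=0,
                    help="show output of failing tests")
parser.add_argument("-vv", dest="verbosity", action=MaxLevelAction,
                    nargs=0, const=2, default=0,
                    help="echo commands as they are run")
parser.add_argument("-a", dest="verbosity", action=MaxLevelAction,
                    nargs=0, const=3, default=0,
                    help="show all output")
```

With this, `-a -v` and `-v -a` both yield level 3, and the post-parse default-setting code goes away.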


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80694/new/

https://reviews.llvm.org/D80694
