[LLVMdev] lit: deprecating trailing \ in RUN lines

Sun Dec 8 05:12:21 PST 2013

On Sun, Dec 8, 2013 at 4:24 AM, Alp Toker <alp at nuanti.com> wrote:

>
> On 08/12/2013 11:22, Chandler Carruth wrote:
>
>  On Sun, Dec 8, 2013 at 2:11 AM, Alp Toker <alp at nuanti.com> wrote:
>
>>  I'd like to propose deprecating and shortly thereafter removing the lit
>> test runner feature that concatenates RUN lines ending in a trailing \
>>
>
>  I'm really opposed to this. Especially for the Clang test suite where
> run lines are often *very* long and hard to organize, read, and edit
> without this feature.
>
>
>>
>> *Rationale:*
>>
>>
>>    - Trailing \ has a special meaning in various language standards that
>>    we support nowadays. In the C preprocessor, for example, it's handled
>>    _before_ comments. Various compilers handle this differently and it
>>    introduces needless incompatibilities.
>>
>>  What incompatibilities? I've never had this be an issue.
>
>
> It's an issue if you try to run the clang tests against other compilers,
> say to check compatibility with MSVC.
>
> The problem is that "the trailing backslash on a continued line is
> commonly referred to as a backslash-newline" -- ie. it's handled by the
> preprocessor, so has significance rather than being part of the comment.
>
> That causes dissonance between what the compiler sees and what lit.py sees
> for no particularly good reason. One of the nice properties of lit tests is
> that they're also valid compiler inputs, so trailing slash is a bit
> unfortunate.
>

AFAIK, the only interesting pattern of RUN lines remains valid compiler
input:

// RUN: ... \
// RUN: ...

This is fine because while the '\' may "surprisingly" make continue the
prior comment line, the next line consisted entirely of a comment, so
whatever. Clang's warning is even silenced here, and while MSVC and GCC may
warn, they still are required to accept the code.

That said, I don't think that we should make tests harder to read or write
just to work around problems in other compilers.

>
> It less of a problem if you're just consuming the suite with make
> check-all, more of a problem for authoring.
>
>
>
>
>
>>
>>    - Forgetting the trailing \ between two RUN lines causes the lines to
>>    be run individually. People have checked in tests which they believed were
>>    getting run whereas the features being tested were actually silently
>>    broken. I've been committing fixes for some of these but it's exceptionally
>>    time-consuming to hunt them down after the fact.
>>
>>  I'd like to understand the rate at which this happens (per RUN-line?
> per test-file?). It's never been a problem for me, but that is in part
> because I check that my tests fail without my change in addition to passing
> with my change.
>
>
> If only everyone did check their changes.
>
> See r194071 where trailing backslash caused a test to always succeed, and
> r194073 for the kind of long-term a broken test has on code quality.
>
> Looking at the SVN log for test/ in clang, and to a lesser extent LLVM
> core, I've been doing nearly all the no-op test fixes in the last few
> months. Not all of them are related to trailing \ but those are the most
> pernicious kinds.
>
> It's a bad idea to gamble that I or someone else will always be around
> taking the time to manually verify old tests to see if they do what they're
> meant to do.
>

This mostly seems to address the challenge of fixing existing tests, which
certainly is hard, but I'm more interested in the challenge of writing a
new test. That is, I'm not worried about the rate at which we clean this
up, but the rate at which we mess this up. if there are only a few cases of
this today after 10 years, then I actually think the problem isn't very
bad. Put another way, if this only happens once or twice a year, is this
really the biggest problem with our test suite?

>>    - Removing trailing \ will introduce the neat property that one RUN
>>    line corresponds precisely to one command that's executed. This is good for
>>    humans and will enable simplifications in the test runner.
>>
>>  FWIW, I've never really had a problem that needed this. The RUN: forms
> a prefix of a shell script in my head, and I know how to read shell scripts
> including multiple lines.
>
>
> The transformations lit does are really too complex and there's at least
> one known bug to do with closed pipes that's contributing to no-op tests
> (think the discussion thread was on cfe-dev).
>
> In a nutshell, the script output lit forms right now is not likely not the
> pipeline you had in your head ;-)
>

I understand that you think this is too complex, but I'm suggesting that
this particular aspect of lit does not seem too complex to at least one
other developer, and thus you shouldn't assume it to be true.

>
> We need to simplify this stuff to fix no-op test issues, and also to
> achieve improved source line information.
>
>
>
>
>>
>>    - Eliminating the trailing \ syntax will unblock work on improved
>>    failure source locations in lit. Right now, when the builders encounter a
>>    test RUN failure it's a matter of guesswork as to which RUN line is
>>    failing, and the cycle of commit-fix-and-watch-buildbots is laborious.
>>    We've all wasted time with this at some point and can totally do better.
>>
>>  While I would very much enjoy better failure reporting, I don't really
> understand why it needs this. We have a in-python parser for the RUN: lines
> which understands what lines a "command" spans?
>
>  Anyways, even though I would *really* like better failure reporting from
> lit, not at the cost of less readable tests. That's the tail wagging the
> dog IMO.
>
>
> So, my contention is that the \ is not making the long lines more
> readable, just pasting over the complexity and hiding bugs.
>
> After all, long pipelines aren't how people use the LLVM tools in the real
> world
>

Zero of the long *lines* that I care about involve long *pipelines*. They
all involve exactly one pipeline from Clang to FileCheck.

They are absolutely how Clang is used in the real world: as a compiler
accepting a huge number of flags from a build system.

> Another option is to use a different break marker and require RUN-NEXT: on
> continuation lines. But my view is that long RUN lines could do with
> simplification anyway, so removing the feature is a better way forward.
>

> I'll throw the ball in your court to see if you have a better solution
> going forward?
>

Uh, no, the ball is in your court to demonstrate that a pervasive change to
the testing infrastructure of LLVM is the right direction going forward. So
far you've presented the following as I see it:

1) Supposed compatibility, but I don't see *any* compatibility issues in
practice and I don't know why this is a priority.
2) Risk of programmer error resulting in false-pass tests. Legitimate
concern, still looking to quantify how bad this problem is.
3) Simplicity of the RUN-lines themselves. I disagree with your assessment,
so there at least doesn't appear to be clear agreement here.
4) Ability to provide accurate source locations. However, it doesn't seem
necessary to me as lit already correctly understands the set of lines
concatenated.

Of these, #2 is the one I really agree with. However, there are a *large*
number of ways to mess this up. It's not clear that this is the most common
and should thus be the priority.

On the other hand, I've presented a couple of reasons why the status quo
seems a good thing:

a) We need the ability to wrap long RUN lines for readability. This is a
real need in Clang's test suite, and I've seen it used well in LLVM's as
well.
b) The '\' character is widely used in shell, and RUN lines are (for better
or worse) a small subset of shell, so it seems reasonable to use them for
familiarity.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131208/4e445547/attachment.html>