[PATCH] D52999: [FileCheck] Annotate input dump

Joel E. Denny via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 10 12:24:34 PDT 2018


jdenny added a comment.

In https://reviews.llvm.org/D52999#1260736, @george.karpenkov wrote:

> OK that's somewhat more clear, but I'm still somewhat confused. Line by line:
>
>   $ FileCheck -v -dump-input=always checks1 < input1 |& sed -n '/^Key for/,$p'
>
>
> I assume `sed` is there to suppress all output before the legend  is printed?


Yes.  If you think all the usual diagnostics should be included here, I can remove the sed command.

> - Should legend be always printed? Wouldn't it make more sense to dump it only if requested?

I considered that.  However, the output is already typically long given that we're dumping the input and probably have -v enabled. The legend doesn't add much more, so I don't see the harm in including it always. Why make life more difficult by forcing the user to try again with another option? It seems best to always put the information close to where the user needs it.

> 
> 
>   Key for input dump annotations:
>   
>     - L:     labels input line number L
> 
> 
> In general, from your legend it's hard to figure out what line refers to what.
>  Here I assume that this item refers to line numbers from the matched files,
>  but it takes some guessing and looking at the output.

It says "input line", so I thought that would make it clear.

> 
> 
>   - T:L    labels the only match result for a pattern of type T from line L

Maybe it would be clearer if this were to say "checks line"?

> - T:L'N  labels the Nth match result for a pattern of type T from line L ```
> 
>   I do not understand what N'th match is.

It's an Nth match *result*.  For example, one result for a pattern might be that it has no match in a particular search range. Another result might be the fuzzy match that FileCheck sometimes reports after that. For CHECK-DAG and -vv, you can also have a series of discarded matches.

Should it say "diagnostic" instead of "match result"?

> 
> 
>   - ^~~    marks good match (requires -v)
>   - !~~    marks bad match
>    
> 
> I could not understand what "bad match" is, could only get it from a more detailed description later

Yes, I meant this as a high-level summary before you read all the details. I also thought it would be a nice reminder for someone who has read all this before. If you have suggestions on a better organization, please let me know.

> 
> 
>   - X~~    marks search range when no match
>    
> 
> "when no match is found" ?

Just trying to be succinct. I can spell it out if that helps.

> 
> 
>   - ?      marks fuzzy match when no match
>    
> 
> I don't understand this line. Is it a best-effort match when no match is found?

Yes. FileCheck already produces all these diagnostics. I'm just representing them as annotations.

> Where the question mark is situated then?

The start of the fuzzy match. FileCheck currently doesn't report the full range of a fuzzy match, and I didn't try to change that.

> 
> 
>   Detailed description of currently enabled markers:
> 
> 
> Should the description of markers be always duplicated?

I originally just had the detailed description but felt it was too intimidating, and I thought the short version before it helped to give a sense of the markers before the details. Again, I'm open to suggestions on how to improve it.

> 
> 
>   - ^~~    marks the final match for an expected pattern
>   - !~~    marks either:
>            - the final match for an excluded pattern
>            - the final but illegal match for an expected pattern
>    
> 
> The explanation is not clear. What is "final"?

Not a discarded match, which sometimes happens for CHECK-DAG.

> It's better to clarify that excluded means supplied with "NOT".

I was trying to keep it general in case we grow other directives with excluded patterns.

> It's not clear what "illegal" means here either.

CHECK-NEXT and CHECK-SAME can match patterns and then complain they're illegal.

For those cases, I'll add examples of directives that are affected.

> 
> 
>   - X~~    marks the search range for an unmatched expected pattern
>    
> 
> Where is X located? Just at the start of the range?

Yes. I thought it would be intuitive to LLVM developers (especially those who have read FileCheck's existing diagnostics, which already include `^~~`) that `X~~` and `!~~` mark ranges like `^~~` but for different purposes.

> 
> 
>   - ?      marks a fuzzy match start for an otherwise unmatched pattern
>    
> 
> What's the difference between `X` and a question mark?

Search range vs fuzzy match.

> 
> 
>   Full input was:
>   <<<<<<
>            1: ; abc def
> 
> 
> Line numbering may become ambiguous with the input, especially the space after the colon.

How? Every input line begins with a line number, colon, and space.  It's so unambiguous, you can write a simple sed expression to extract just the input text from the annotated dump. Maybe you're saying I should mention the space in the legend?

> Is line numbering required?

Without line numbering, you cannot always distinguish input lines and annotation lines.  Moreover, the line numbers help when you see a diagnostic before the dump and want to search for the corresponding line.

> Should there be a better separation?

I'm open to suggestions, but this seems to me like the most obvious way to do it.

> 
> 
>   chk:1         ^~~
> 
> 
> It's confusing that (from my understanding) line numbers above refer to line numbers in the input document,
>  but line numbers here refer to line numbers in the file with FileCheck directives.

That's true throughout existing FileCheck diagnostics, and I don't know what to do about it. The user must remember that input and checks are (sometimes) in different files.

> 
> 
>   sam:2             ^~~
> 
> 
> `sam` is not the best abbreviation for "SAME". Maybe spare another letter? Or use "sme" or something?

sme is fine by me.

> 
> 
>            2: ; ghI jkl
>   nxt:3'0     X~~~~~~~~
> 
> 
> I don't understand the semantics here. What's `'0`?

Diagnostic 0 for the CHECK-NEXT on line 3.

> Why `X` is below the semicolon?

Because that's the start of the search range.

> If it's always at the start of the line, should it be there at all?

Sometimes the search range doesn't start at the start of the line because the last match didn't end at the end of the previous line.

> 
> 
>   nxt:3'1       ?
> 
> 
> What is the purpose of this question mark? If we have already failed the search at this point because the previous pattern failed,
>  does it convey any information to put the question mark at the start of the line?

It's not at the start of the line. It's at the start of the fuzzy match, which is later.

> I don't understand what `'1` means here either.

Diagnostic 1 for the CHECK-NEXT on line 3.

>   >>>>>>
>    
>    $ cat checks1
>    CHECK: abc
>    CHECK-SAME: def
>    CHECK-NEXT: ghi
>    CHECK-SAME: jkl
>    
>    $ cat input1
>    ; abc def
>    ; ghI jkl
>    

Thanks for all your helpful comments. Please keep them coming, and I'll apply changes as we agree on them.


https://reviews.llvm.org/D52999





More information about the llvm-commits mailing list