[PATCH] D72357: Fix llvm-objcopy when directory contains "bar".

Fri Jan 24 19:53:53 PST 2020

dblaikie added inline comments.

================
Comment at: llvm/test/tools/llvm-objcopy/ELF/only-section-strip-undefined.test:6
 # RUN: llvm-objcopy -j .other.section %t.o %t2.o
-# RUN: llvm-readobj --symbols %t2.o | FileCheck %s --implicit-check-not=bar
+# RUN: llvm-readobj --symbols %t2.o | FileCheck %s --implicit-check-not=barrr

----------------
MaskRay wrote:
> MaskRay wrote:
> > dblaikie wrote:
> > > MaskRay wrote:
> > > > dblaikie wrote:
> > > > > rupprecht wrote:
> > > > > > MaskRay wrote:
> > > > > > > jhenderson wrote:
> > > > > > > > rupprecht wrote:
> > > > > > > > > MaskRay wrote:
> > > > > > > > > > MaskRay wrote:
> > > > > > > > > > > rupprecht wrote:
> > > > > > > > > > > > MaskRay wrote:
> > > > > > > > > > > > > `--implicit-check-not='Name: bar'`
> > > > > > > > > > > > I assume the match you're getting is the first line from llvm-readobj:
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > File: /Users/jlebar/path/to/something.o
> > > > > > > > > > > > Format: ELF64-x86-64
> > > > > > > > > > > > Arch: x86_64
> > > > > > > > > > > > AddressSize: 64bit
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > Piping from stdin will remove that filename; can you instead try:
> > > > > > > > > > > > ```
> > > > > > > > > > > > # RUN: llvm-readobj --symbols < %t2.o | FileCheck %s --implicit-check-not=bar
> > > > > > > > > > > > ```
> > > > > > > > > > > Using `< file` input redirection may leave us in a very inconvenient situation. Maybe we should just be more careful when using `--implicit-check-not`, instead of replacing every single use of `llvm-readobj %t` with `llvm-readobj < %t`.
> > > > > > > > > > I'd like to hear @jhenderson what wants to say here.
> > > > > > > > > When you say it's inconvenient, can you clarify why?
> > > > > > > > > 
> > > > > > > > > The practice of using piping to avoid the filename present is documented here, albeit for a different tool (opt instead of llvm-readobj): http://llvm.org/docs/TestingGuide.html#fragile-tests
> > > > > > > > This isn't likely to affect too many people, but if a test redirects from stdin, it is a pain to rerun a part of it in my default natural environment, Windows PowerShell, because PowerShell doesn't support the '<' redirection command for some unknown reason.
> > > > > > > > 
> > > > > > > > MaskRay's suggestions in this test would seem more preferable to me therefore, and less fragile for other reasons too, I reckon: "Name: bar" is much less likely to appear anywhere in the output beyond the bit being checked. Also, "bar" is a common enough three-letter pattern to appear in another word that even shell redirection won't necessarily solve it in all cases or could prove flaky in the future if more output is added to llvm-readobj.
> > > > > > > @rupprecht Do we have a consensus now?
> > > > > > > This isn't likely to affect too many people, but if a test redirects from stdin, it is a pain to rerun a part of it in my default natural environment, Windows PowerShell, because PowerShell doesn't support the '<' redirection command for some unknown reason.
> > > > > > > 
> > > > > > 
> > > > > > There seems to be thousands of llvm-based tests already that use use < as a pipe. It's a shame PowerShell doesn't support it.
> > > > > > PowerShell does seem to have a replacement called `Get-Content`. I wonder if we could have lit stdout rewrite the RUN commands so "foo < x y z" is printed out as "Get-Content x | foo y z"?
> > > > > > 
> > > > > > ```
> > > > > > $ find . -type f -path '*test*' -print0 | xargs -0 grep -l 'RUN.*llvm-.*<' | wc -l
> > > > > > 3704
> > > > > > ```
> > > > > > 
> > > > > > PowerShell does seem to support the pipe operator, so `cat %t2.o | llvm-readobj --symbols` might also work, although I only see one test that uses cat this way. (I see lots of `cat ... | FileCheck`, but nothing piping into a tool). I think if we want to support PowerShell we should do something about the thousands of tests and the documentation using PowerShell-incompatible features.
> > > > > > 
> > > > > > Anyway... I'm still slightly leaning toward `RUN: llvm-readobj --symbols < %t2.o` because (1) it matches what the rest of llvm testing does, and (2) it actually prevents full filenames from leaking into the tool output, so you don't have to play games with more advanced FileCheck options. (There is a risk that the filename might find some other way to creep in, but I think that's mostly theoretical).
> > > > > > OTOH, I also don't want these tests to be failing for users based on their username, so either patch is better than nothing & perfect should not be the enemy of good and all that, so if nobody else likes `<` then feel free to submit whichever patch you like.
> > > > > Yeah - please stick with "<" given the precedent in the project already. If there's a systemic issue with that that could be addressed by a similarly legible alternative or improvements to Lit, I think that's worth discussing separately.
> > > > > 
> > > > > (though it's not clear to me if @MaskRay's "Using < file input redirection may leave us in a very inconvenient situation. " refers to the issue that @jhenderson mentioned, or was alluding to some other, as yet unclarified, inconvenience - if there's some other unknowns here, happy to discuss them - though, again, given the precedent, please go ahead with using "<" (if it works as well here as in the many other tests it's used in) and start a separate llvm-dev discussion, perhaps, on addressing those other issues)
> > > > I should have been clearer. Some GNU binutils (nm, objdump, size) have a default file name: "a.out", when no filename is specified. llvm-nm, llvm-objdump, llvm-size and llvm-dwarfdump learned this rule from GNU binutils. I have a weak preference not doing this, but I don't have a strong enough motivation to change it, and I feel someone may report bugs if LLVM binary utilities don't have the default file name.
> > > > 
> > > > For such utilities, `llvm-nm < %s` will not read from stdin. If we want to change some utilities (llvm-objcopy in question) to follow the convention of traditional LLVM tools (llvm-as/llc/opt/etc), the aforementioned tools will inevitably cause inconsistency which cannot be fixed.
> > > > 
> > > > A balance between "don't let users report bugs" and "don't let such behavior get in our way" is not changing our current practice, but making the tests a bit more robust (D72358). By doing that, we can circumvent the problem we encountered by @jlebar and @bbaren. I did worry whether that would bring more mental burden when writing tests, but from my experience in D72358 it does not look that bad.
> > > If any of our tools have such default behavior and need it, they should have that behavior - our testing strategy shouldn't dictate feature usability.
> > > 
> > > But I'd still advocate for the same testing strategy using such a feature, using "toolname - < inputname" to tell the tool to read from stdin, then pipe the file into stdin. The existence of a "no file name means a.out" doesn't change the general approach, in my opinion.
> > > 
> > > I'd rather not circumvent only the current problem - all sorts of other substrings may appear in a user's path that might confuse tests & we should avoid having the path in the test at all. (yeah, we don't do this for all tests - but I think when it comes up it's worth fixing firmly & avoiding the input name appearing anywhere in the checked output)
> > ```
> > % size - < a
> > size: '-': No such file
> > ```
> > 
> > Well, this can be a feature request to GNU binutils.
> > 
> > Doing things differently for `llvm-size - < a` and `llvm-readobj -d < a` seem a bit inconvenient.
> > 
> > For consistency, we may need to change every binary utilities use case to either `- < a` or `< a`. I acknowledge the filename-in-output problem, but am worried that the proposed fix is causing too much burden.
> We have 700+ occurrences of llvm-objcopy, nears 4000 occurrences of llvm-readobj in tests. Very few of them have the problem. Adapting every test case to be consistent with 2 llvm-objcopy tests is too big a hammer.
I'm not suggesting changing all occurrences - though I am suggesting that piping input is best practice in LLVM for writing input tests (& this extends beyond objcopy and readobj - similarly for LLVM IR tests too, etc - piping input ensures the on-disk path doesn't appear in the output in a number of different tools) & the preferred solution when tests end up having these sorts of problems.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72357/new/

https://reviews.llvm.org/D72357