[PATCH] Fix dependency file escaping
Paul Robinson
Paul_Robinson at playstation.sony.com
Thu Apr 23 12:48:12 PDT 2015
================
Comment at: lib/Frontend/DependencyFile.cpp:298
@@ -295,3 +297,3 @@
for (unsigned i = 0, e = Filename.size(); i != e; ++i) {
- if (Filename[i] == ' ' || Filename[i] == '#')
+ if (Filename[i] == ' ' || Filename[i] == '#') {
OS << '\\';
----------------
probinson wrote:
> silvas wrote:
> > probinson wrote:
> > > silvas wrote:
> > > > Why `#` also? We are trying to be gcc-compatible. See https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libcpp/mkdeps.c;h=78bbf09a0cc8c3ba0fe974b55e1ededf02c66665;hb=HEAD#l47
> > > No, we're trying to produce something that GNU Make can use. What gcc produces is broken by any definition.
> > >
> > > $ cat tspace.c
> > > #include "foo\ bar.h"
> > > #include "foo\#bar.h"
> > > int i;
> > > $ ls
> > > foo\ bar.h foo\#bar.h tspace.c
> > > $ gcc -c -MMD tspace.c
> > > $ cat tspace.d
> > > tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h
> > > $ make -f tspace.d tspace.o
> > > make: *** No rule to make target `foo\', needed by `tspace.o'. Stop.
> > > $
> > >
> > Please file a bug with gcc.
> >
> > Also, curiously that .d file seems to work for me on windows under cygwin:
> >
> > ```
> > Sean at Sean-PC ~
> > $ cat >tspace.c
> > #include "foo\ bar.h"
> > #include "foo\#bar.h"
> > int i;
> >
> > Sean at Sean-PC ~
> > $ mkdir foo
> >
> > Sean at Sean-PC ~
> > $ cd foo
> >
> > Sean at Sean-PC ~/foo
> > $ mkdir ' bar.h'
> >
> > Sean at Sean-PC ~/foo
> > $ mkdir '#bar.h'
> >
> > Sean at Sean-PC ~/foo
> > $ cd ..
> >
> > Sean at Sean-PC ~
> > $ cat >tspace.d
> > tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h
> >
> > Sean at Sean-PC ~
> > $ make -f tspace.d
> > cc -c -o tspace.o tspace.c
> > make: cc: Command not found
> > <builtin>: recipe for target 'tspace.o' failed
> > make: *** [tspace.o] Error 127
> > ```
> >
> > Also, clang's current output seems to work fine on cygwin. In what cases is clang's output currently breaking? Is it just the case of filenames containing backslashes followed by space/hash on unix?
> >
> > The set of strings that can be correctly escaped in make is inherently limited, so we are inherently compromising on what file names can be properly represented. Does it make sense to extend the set of properly representable filenames to files containing backslashes in their name?
> >
> > E.g. include "foo\ bar.h". On unix we would be instructing make to look for a file 'foo\ bar.h' in the current directory but on windows in my testing above we would be instructing make to look for a file ' bar.h' in directory 'foo'. Any project for which that would actually be correct behavior is insane. I think that realistically, backslashes are only going to end up in includes on windows where they will be intended as directory separators, and will result in compilation errors on other platforms. I believe that clang's current output actually works correctly in that case.
> >
> > We might be overthinking this. What case are we fixing? And does it make sense to fix that case?
> Cygwin is not doing the right thing for me. Using Clang's depfile:
> {code}
> probinson at PROBINSON-T3610 ~
> $ /cygdrive/e/upstream/build/debug/bin/clang -c -MMD tspace.c
>
> probinson at PROBINSON-T3610 ~
> $ cat tspace.d
> tspace.o: tspace.c foo\\ bar.h foo\\#bar.h
>
> probinson at PROBINSON-T3610 ~
> $ touch 'foo\#bar.h'
>
> probinson at PROBINSON-T3610 ~
> $ make -f tspace.d tspace.o
> make: *** No rule to make target 'bar.h', needed by 'tspace.o'. Stop.
> {code}
> Escaping the space is no good unless you escape the preceding backslash.
>
> With gcc's depfile:
> {code}
> probinson at PROBINSON-T3610 ~
> $ cat tspace.d
> tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h
>
> probinson at PROBINSON-T3610 ~/foo
> $ touch 'foo/#bar.h'
>
> probinson at PROBINSON-T3610 ~
> $ make -f tspace.d tspace.o
> make: 'tspace.o' is up to date.
> {code}
> I suspect it's looking at directory foo, and treating #bar.h as a comment.
>
> But if I add the extra backslash manually, it works correctly.
> {code}
> probinson at PROBINSON-T3610 ~
> $ cat tspace.d
> tspace.o: tspace.c foo\\\ bar.h foo\\\#bar.h
>
> probinson at PROBINSON-T3610 ~
> $ touch 'foo/#bar.h'
>
> probinson at PROBINSON-T3610 ~
> $ make -f tspace.d tspace.o
> cc -c -o tspace.o tspace.c
> {code}
>
> So, what am I fixing? If we're going to do anything special with space and hash, then we should do a special thing that causes them to work properly with a tool that knows how to interpret the special thing. The only tool I have handy that is willing to do a special thing with space and hash is GNU Make (on Linux or Windows). The special thing to do is backslash-quote those characters. But that doesn't work in GNU Make unless you also backslash-quote the preceding backslashes; and that characteristic is the same for quoting either space or hash.
>
> Therefore, what clang produces won't work at all, and what gcc produces will work only for space and not for hash. My fix makes Clang's output work for both space and hash. Provably.
>
> I freely admit--always have, I think--that this is a rather obscure corner case that just isn't going to happen in practice. But it's provably wrong, and my fix provably corrects it. This is a problem?
>
> Alternatively, we can just abandon any special processing for space and hash, and say Don't Do That. But leaving this thing in its current provably wrong state just leaves me professionally offended.
>
And why I want to fix this... aside from the professionally-offended part... :-)
We are looking at modifying the PS4 toolchain's builder to move away from double-quotes, to doing things the GNU-Make style way. But I don't want to require our builder to correctly interpret broken depfiles; Clang should be producing correctly formed depfiles, and our builder should be able to depend on depfiles being correctly formed.
http://reviews.llvm.org/D9208
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the cfe-commits
mailing list