[PATCH] Fix dependency file escaping

Paul Robinson Paul_Robinson at playstation.sony.com
Wed Apr 22 18:51:54 PDT 2015


================
Comment at: lib/Frontend/DependencyFile.cpp:298
@@ -295,3 +297,3 @@
   for (unsigned i = 0, e = Filename.size(); i != e; ++i) {
-    if (Filename[i] == ' ' || Filename[i] == '#')
+    if (Filename[i] == ' ' || Filename[i] == '#') {
       OS << '\\';
----------------
silvas wrote:
> probinson wrote:
> > silvas wrote:
> > > Why `#` also? We are trying to be gcc-compatible. See https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libcpp/mkdeps.c;h=78bbf09a0cc8c3ba0fe974b55e1ededf02c66665;hb=HEAD#l47
> > No, we're trying to produce something that GNU Make can use.  What gcc produces is broken by any definition.
> > 
> > $ cat tspace.c
> > #include "foo\ bar.h"
> > #include "foo\#bar.h"
> > int i;
> > $ ls
> > foo\ bar.h  foo\#bar.h  tspace.c
> > $ gcc -c -MMD tspace.c
> > $ cat tspace.d
> > tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h
> > $ make -f tspace.d tspace.o
> > make: *** No rule to make target `foo\', needed by `tspace.o'.  Stop.
> > $ 
> > 
> Please file a bug with gcc.
> 
> Also, curiously that .d file seems to work for me on windows under cygwin:
> 
> ```
> Sean at Sean-PC ~
> $ cat >tspace.c
> #include "foo\ bar.h"
> #include "foo\#bar.h"
> int i;
> 
> Sean at Sean-PC ~
> $ mkdir foo
> 
> Sean at Sean-PC ~
> $ cd foo
> 
> Sean at Sean-PC ~/foo
> $ mkdir ' bar.h'
> 
> Sean at Sean-PC ~/foo
> $ mkdir '#bar.h'
> 
> Sean at Sean-PC ~/foo
> $ cd ..
> 
> Sean at Sean-PC ~
> $ cat >tspace.d
> tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h
> 
> Sean at Sean-PC ~
> $ make -f tspace.d
> cc    -c -o tspace.o tspace.c
> make: cc: Command not found
> <builtin>: recipe for target 'tspace.o' failed
> make: *** [tspace.o] Error 127
> ```
> 
> Also, clang's current output seems to work fine on cygwin. In what cases is clang's output currently breaking? Is it just the case of filenames containing backslashes followed by space/hash on unix?
> 
> The set of strings that can be correctly escaped in make is inherently limited, so we are inherently compromising on what file names can be properly represented. Does it make sense to extend the set of properly representable filenames to files containing backslashes in their name?
> 
> E.g. include "foo\ bar.h". On unix we would be instructing make to look for a file 'foo\ bar.h' in the current directory but on windows in my testing above we would be instructing make to look for a file ' bar.h' in directory 'foo'. Any project for which that would actually be correct behavior is insane. I think that realistically, backslashes are only going to end up in includes on windows where they will be intended as directory separators, and will result in compilation errors on other platforms. I believe that clang's current output actually works correctly in that case.
> 
> We might be overthinking this. What case are we fixing? And does it make sense to fix that case?
Cygwin is not doing the right thing for me.  Using Clang's depfile:
{code}
probinson at PROBINSON-T3610 ~
$ /cygdrive/e/upstream/build/debug/bin/clang -c -MMD tspace.c

probinson at PROBINSON-T3610 ~
$ cat tspace.d
tspace.o: tspace.c foo\\ bar.h foo\\#bar.h

probinson at PROBINSON-T3610 ~
$ touch 'foo\#bar.h'

probinson at PROBINSON-T3610 ~
$ make -f tspace.d tspace.o
make: *** No rule to make target 'bar.h', needed by 'tspace.o'.  Stop.
{code}
Escaping the space is no good unless you escape the preceding backslash.

With gcc's depfile:
{code}
probinson at PROBINSON-T3610 ~
$ cat tspace.d
tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h

probinson at PROBINSON-T3610 ~/foo
$ touch 'foo/#bar.h'

probinson at PROBINSON-T3610 ~
$ make -f tspace.d tspace.o
make: 'tspace.o' is up to date.
{code}
I suspect it's looking at directory foo, and treating #bar.h as a comment.

But if I add the extra backslash manually, it works correctly.
{code}
probinson at PROBINSON-T3610 ~
$ cat tspace.d
tspace.o: tspace.c foo\\\ bar.h foo\\\#bar.h

probinson at PROBINSON-T3610 ~
$ touch 'foo/#bar.h'

probinson at PROBINSON-T3610 ~
$ make -f tspace.d tspace.o
cc    -c -o tspace.o tspace.c
{code}

So, what am I fixing?  If we're going to do anything special with space and hash, then we should do a special thing that causes them to work properly with a tool that knows how to interpret the special thing.  The only tool I have handy that is willing to do a special thing with space and hash is GNU Make (on Linux or Windows).  The special thing to do is backslash-quote those characters. But that doesn't work in GNU Make unless you also backslash-quote the preceding backslashes; and that characteristic is the same for quoting either space or hash.

Therefore, what clang produces won't work at all, and what gcc produces will work only for space and not for hash.  My fix makes Clang's output work for both space and hash.  Provably.

I freely admit--always have, I think--that this is a rather obscure corner case that just isn't going to happen in practice. But it's provably wrong, and my fix provably corrects it.  This is a problem?

Alternatively, we can just abandon any special processing for space and hash, and say Don't Do That.  But leaving this thing in its current provably wrong state just leaves me professionally offended.

http://reviews.llvm.org/D9208

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the cfe-commits mailing list