[LLVMdev] [llvm-commits] Dealing with a corrupted /proc/self/exe link

Chandler Carruth chandlerc at google.com
Sat Jul 14 02:55:30 PDT 2012


On Sat, Jul 14, 2012 at 1:57 AM, Gabor Greif <gabor.greif at alcatel-lucent.com> wrote:

> Chandler Carruth wrote:
> > On Fri, Jul 13, 2012 at 1:40 PM, Benjamin Kramer <benny.kra at gmail.com
> > <mailto:benny.kra at gmail.com>> wrote:
> >
> >
> >     On 13.07.2012, at 21:39, Gabor Greif <gabor.greif at alcatel-lucent.com
> >     <mailto:gabor.greif at alcatel-lucent.com>> wrote:
> >
> >     > Benjamin Kramer wrote:
> >     >> On 13.07.2012, at 09:46, Gabor Greif
> >     <gabor.greif at alcatel-lucent.com
> >     <mailto:gabor.greif at alcatel-lucent.com>> wrote:
> >     >>
> >     >>> Hi all,
> >     >>>
> >     >>> I am in charge of the controlled introduction of clang into
> >     >>> our builds at my workplace. Since all our tools must run from
> >     >>> a ClearCase view for automatic dependency tracking, we have been
> >     >>> bitten by a Linux bug, and readlink("/proc/self/exe", ...) gives
> >     >>> nonsensical results. So we need to introduce a configure option
> >     >>> for disallowing this method of executable discovery (the other
> >     >>> one works well).
> >     >>
> >     >> Interesting, can you describe the linux bug? Are the kernel devs
> >     aware of it?
> >     >
> >     > It is fixed in newer RHEL kernels (>=6). What I know is that this is a
> >     > ClearCase VFS-related bug that fails to do a reverse mapping to obtain
> >     > the logical pathname from the real (into the backing store of ClearCase)
> >     > one.
> >     >
> >     > Here is a bug report:
> >     > <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6189256>
> >     >
> >     >>
> >     >> We often had reports about /proc/self/exe not working (and thus
> >     clang crashing)
> >     >> in chrooted environments. It is possible to mount /proc into the
> >     chroot but this
> >     >> seems to be missing from many setups. The code in LLVM that uses
> >     /proc/self/exe
> >     >> returns an empty string on error which confuses clang.
> >     >
> >     > There is no empty string for me, and the returned string is a real
> >     object
> >     > (bytewise identical to the real thing) :
> >     >
> >     > $ cd <into a dynamic view>
> >     > $ cp /bin/ls .
> >     > $ ls -l /proc/self/exe
> >     > lrwxrwxrwx 1 ggreif ocs 0 Jul 13 21:27 /proc/self/exe -> /bin/ls
> >     > $ ./ls -l /proc/self/exe
> >     > lrwxrwxrwx 1 ggreif ocs 0 Jul 13 21:27 /proc/self/exe -> /vol/ocs_userviews25_13/ggreif-hc_stm-OCSnb28718.vws/.s/00056/800006ba4fdf647els
> >     >
> >     > $ diff ./ls /vol/ocs_userviews25_13/ggreif-hc_stm-OCSnb28718.vws/.s/00056/800006ba4fdf647els
> >     > <no diffs>
> >     >
> >     > Unfortunately starting from the clang executable, there is no useful
> >     > directory structure to be discovered :-(
> >     >
> >     >>
> >     >> I don't really like having an autoconf switch for this as long as
> >     you can determine
> >     >> whether the result from /proc/self/exe is valid. When you're
> >     adding a fallback to
> >     >> Path.inc anyways, why not just try reading /proc/self/exe first,
> >     and if it fails, use
> >     >> your fallback? That would also fix the chroot problem.
> >     >
> >     > This is not a chroot problem. As shown above, I do not get a valid
> >     clang path
> >     > to manipulate and discover include directories, etc.
> >     >
> >     > The other method in lib/Support/Unix/Path.inc (i.e. dladdr,
> >     realpath) works.
> >     >
> >     > I still maintain that I need the configure option.
> >
> >     Sorry for being mean, but this is a workaround for a bug in the
> >     linux kernel that was
> >     fixed years ago and is only visible when using an obscure revision
> >     control system.
> >
> >     Also it requires rebuilding LLVM, so the fix isn't even helpful
> >     without researching the
> >     issue (if someone else hits it).
> >
> >     With this in mind I really don't see why this has to be in the
> >     public tree, requiring
> >     additions to two build systems. Can't you just apply the
> >     one-line-patch to Path.inc
> >     locally?
> >
> >
> > I agree, this patch as is doesn't belong in the tree...
>
> Hi Chandler,
>
> yes, the audience is rather narrow (i.e. 'us' :-)
>
> >
> > However, I suspect that Clang already has the capability to solve this
> > problem for you.
>
> Ok, good to hear.
>
> >
> > For context, we run Clang in a distributed build environment not
> > dissimilar to the one you are describing, and for us as well
> > /proc/self/exe does not really help us locate the Clang binary. There is
> > a switch available (-no-canonical-prefixes) which in conjunction with
> > some other things should use the value of argv[0] in main to locate the
> > clang binary, not /proc/self/exe or anything else.
>
> I shall read more on this in the code and experiment around a bit.
> Is this way configurable, or a switch to clang? Clearly the former
> would be better.
>

It's a flag to Clang. I really dislike configure switches, and generally
push for Clang to avoid them when at all possible. Avoiding them makes both
testing and supporting users much easier.

In particular, as the only groups to truly need this behavior are build
systems which manage the file content trees specially, it seems reasonable
for those build systems to pass the appropriate flags to Clang.

I gave you the flag name above, so please give it a spin.



>
> >
> > Can you describe why it is that Clang is reading /proc/self/exe? We
> > might be able to change that in a principled way to support numerous
> > different filesystem layouts and scenarios where its results are correct
> > but not helpful for locating executable-relative directory structures.
>
> $ echo "int main(){return 0;}" > ttt.c
> $ gdb Release+Asserts/bin/clang
>
> Reading symbols from /home/ggreif/llvm/Release+Asserts/bin/clang...(no debugging symbols found)...done.
>

Err, could you use a debug build please? =[ The information below doesn't
help much because...


>
> (gdb) b dladdr
> Breakpoint 1 at 0x5d0d58
>
> (gdb) run -c ttt.c
> Starting program: /home/ggreif/llvm/Release+Asserts/bin/clang -c ttt.c
> warning: no loadable sections found in added symbol-file system-supplied
> DSO at 0x2aaaaaaab000
> [Thread debugging using libthread_db enabled]
>
> Breakpoint 1, 0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
> (gdb) bt
> #0  0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
> #1  0x00000000019d554d in llvm::sys::Path::GetMainExecutable(char const*, void*) ()
> #2  0x00000000005d8882 in main ()


... main doesn't call GetMainExecutable. Inlining and a bunch of other
stuff has happened here.

Anyways, I know this code. You could probably find it yourself. If you add
line numbers to your build, you'll get a stack trace pointing you to
tools/driver/driver.cpp:56 here, where we call GetMainExecutable. If you
read lines 50 and 51, you'll see the logic I described where if
-no-canonical-prefixes is used, we instead trust argv[0] (spelled by a
different name, look at the caller to see the gory details).
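
To make that decision concrete, here is a rough, self-contained sketch of
the idea. It is not the code in driver.cpp: the helper names
(hasNoCanonicalPrefixes, readProcSelfExe, pickExecutablePath) are made up
for illustration, it reads /proc/self/exe directly rather than going
through GetMainExecutable, and the fallback to argv[0] when the link cannot
be read is an assumption of the sketch, not something the driver is known
to do.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <unistd.h> // readlink (POSIX)

// Hypothetical helper: true if -no-canonical-prefixes appears in the arguments.
bool hasNoCanonicalPrefixes(int argc, const char *const *argv) {
  for (int i = 1; i < argc; ++i)
    if (std::strcmp(argv[i], "-no-canonical-prefixes") == 0)
      return true;
  return false;
}

// Hypothetical helper: resolve the /proc/self/exe link.
// Returns an empty string if the link cannot be read (e.g. no /proc mounted).
std::string readProcSelfExe() {
  char buf[4096]; // fixed-size buffer; longer targets would be truncated
  ssize_t n = ::readlink("/proc/self/exe", buf, sizeof(buf) - 1);
  if (n <= 0)
    return std::string();
  return std::string(buf, static_cast<size_t>(n));
}

// Hypothetical helper: pick the path used to find executable-relative
// directories. With the flag, trust argv[0] as given; otherwise use the
// canonical path, falling back to argv[0] if that lookup fails (assumption).
std::string pickExecutablePath(int argc, const char *const *argv) {
  assert(argc > 0 && argv[0] && "argv[0] is required");
  if (hasNoCanonicalPrefixes(argc, argv))
    return argv[0];
  std::string canonical = readProcSelfExe();
  return canonical.empty() ? std::string(argv[0]) : canonical;
}
```

With the flag present, the path the build system used to invoke the binary
is the one that gets used, so a VFS that reports a backing-store path
through /proc/self/exe never enters the picture.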


But there are a *lot* of ways that Clang will misbehave when run in a
heavily symlinked (or equivalent synthetic VFS) tree unless you pass this
flag. That's why it exists in both Clang and GCC. Let me know if you still
see trouble when using it.