[cfe-dev] clang building linux
Dale Johannesen
dalej at apple.com
Tue Oct 26 10:43:47 PDT 2010
Please do file bug reports on compiler problems. Sometimes they get fixed.
On Oct 26, 2010, at 6:46 AM PDT, pageexec at freemail.hu wrote:
> hello folks,
>
> given the recent interest both on the list and elsewhere in building a working
> linux kernel, here's my 2 cents. i began this work some half a year ago when
> 2.7 came out but got held up by other projects so i could only finish it recently.
>
> my approach is different from others who have been working on this in that i
> went for patching linux itself in order to compile and link with clang properly.
> it turns out that with a hundred or so lines patched in linux and a recent clang
> (read: use svn HEAD) it's very easy to build a working kernel now. obviously some
> of these patches are workarounds for features lacking in clang so the right
> approach there is to change clang. some patches are needed for linux bugs, there's
> nothing clang can (or should) do about them i think. here's a summary of the issues
> i ran into in no particular order:
>
> 1. early boot code and .code16gcc/-mregparm
>
> i'm not sure if it's .code16gcc or not, but something makes clang ignore
> -mregparm when compiling the early linux boot code so there'll be a mismatch
> between how arguments are passed from C code and how assembly code expects
> them. the workaround is to explicitly annotate some functions with the attribute.
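
A minimal sketch of the per-function annotation described in item 1. The macro name DEMO_REGPARM and the function demo_strnlen are hypothetical and are not taken from the patch; the register count of 3 is also an assumption. The regparm attribute only exists on 32-bit x86, so the macro expands to nothing elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__)
#define DEMO_REGPARM __attribute__((regparm(3)))
#else
#define DEMO_REGPARM /* regparm is only meaningful on 32-bit x86 */
#endif

/*
 * Annotated prototype: the calling convention is now part of the
 * declaration, so it does not depend on the command-line flag
 * being honoured for this translation unit.
 */
DEMO_REGPARM size_t demo_strnlen(const char *s, size_t max);

DEMO_REGPARM size_t demo_strnlen(const char *s, size_t max)
{
    size_t n = 0;

    assert(s != NULL);
    /* Count bytes up to the terminator, but never more than max. */
    while (n < max && s[n] != '\0')
        n++;
    return n;
}
```

Assembly callers would then load the first arguments into registers (EAX, EDX, ECX on i386) to match the annotated declaration.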
>
> 2. probably related to the above, __builtin_memcpy and __builtin_memset also
> ignore -mregparm and cause the same kind of trouble at runtime so i worked it
> around by using explicit inline asm.
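
One way the explicit inline asm from item 2 could look, as a sketch only. The message does not say which instructions the patch uses; rep movsb is an assumption here, and demo_memcpy is a hypothetical name. Non-x86 targets fall back to a plain byte loop so the example still builds.

```c
#include <assert.h>
#include <stddef.h>

static inline void *demo_memcpy(void *dst, const void *src, size_t len)
{
    void *ret = dst;

    assert(dst != NULL && src != NULL);
#if defined(__i386__) || defined(__x86_64__)
    /*
     * rep movsb copies len bytes from [src] to [dst]. It modifies
     * the destination, source and count registers, so all three are
     * read-write ("+") operands; "memory" tells the compiler that
     * memory contents change behind its back.
     */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len)
                     :
                     : "memory");
#else
    /* Portable fallback: plain byte-by-byte copy. */
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len--)
        *d++ = *s++;
#endif
    return ret;
}
```

Because no call to a builtin is emitted, the copy does not depend on how the compiler passes arguments to memcpy.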
>
> 3. sse code in kernel
>
> in general linux is already built with -mno-sse and others but some Makefiles
> such as the x86 boot code forget to use it with bad consequences for early boot
> (read: the kernel doesn't even decompress ;).
>
> 4. unused variable/function elimination
>
> it seems that clang is more aggressive than gcc and eliminates more actually
> required data/code than desired. earliest casualty is the boot code as usual
> but there're also some module parameter related structures affected. the fix
> is needed on the linux side of course.
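
A sketch of the kind of Linux-side fix item 4 points at, assuming the affected objects are ones that no C code references directly (they are found through assembly or linker tables). The GCC/clang `used` attribute asks the compiler to emit such an object anyway. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define DEMO_USED __attribute__((used))

struct demo_param {
    const char *name;
    int value;
};

/*
 * Nothing in this file refers to the object by name, so without
 * `used` the compiler is free to drop it as dead data.
 */
static const struct demo_param demo_param_debug DEMO_USED = { "debug", 1 };

/* Same idea for a static function that is only reached indirectly. */
static int DEMO_USED demo_param_matches(const struct demo_param *p,
                                        const char *name)
{
    assert(p != NULL && name != NULL);
    return strcmp(p->name, name) == 0;
}
```

Note that `used` only affects the compiler; whether the linker keeps the containing section is a separate question.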
>
> 5. asm 'p' constraint
>
> this was fixed last week in subversion, so i'm omitting the patch for it, but
> if someone really wants to use an earlier clang (such as the 2.8 release), then
> just duplicate the percpu_read macro into percpu_read_stable.
>
> 6. .gnu.linkonce.d.* section usage
>
> it seems that clang can emit code/data into sections that the linux linker
> scripts were not aware of.
>
> 7. extern and __attribute__((visibility("hidden"))) usage in the vdso
>
> it seems that this construct doesn't work with clang so i worked it around for
> now by abusing the weak attribute and the linker's ability to merge such symbols.
>
> 8. const merging in the vdso
>
> possibly related to the above, the linker(?) merges const variables when their
> value is the same which, while technically correct, defeats some self-checking
> code in the vdso so i had to deconstify the affected variables.
>
> 9. lack of __label__ support
>
> linux needs this for implementing an arch-independent way to acquire the current
> program counter or something close to it at least, for now the workaround is an
> arch specific inline asm block.
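
A sketch of an arch-specific inline asm block for reading the current program counter, as described in item 9. This is not the code from the patch; demo_current_ip is a hypothetical name, and the last branch is only an approximation that returns the caller's return address.

```c
#include <stdint.h>

static inline __attribute__((always_inline)) uintptr_t demo_current_ip(void)
{
    uintptr_t ip;

#if defined(__x86_64__)
    /* RIP-relative lea yields the address of the next instruction. */
    __asm__ volatile("leaq 0(%%rip), %0" : "=r"(ip));
#elif defined(__i386__)
    /* call pushes the address of label 1, which is then popped. */
    __asm__ volatile("call 1f\n1:\tpopl %0" : "=r"(ip));
#elif defined(__aarch64__)
    /* adr with "." yields the address of this instruction. */
    __asm__ volatile("adr %0, ." : "=r"(ip));
#else
    /* Approximation only: the return address of the enclosing function. */
    ip = (uintptr_t)__builtin_return_address(0);
#endif
    return ip;
}
```

The function is forced inline so that the value belongs to the call site rather than to a helper, which is what the label-based version provides.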
>
> 10. clang crash on __verify_pcpu_ptr use
>
> when compiling i think init/main.c, clang crashes on the above macro. i tried to
> extract a minimal example but that failed to produce any errors, so probably there
> is more context needed to trigger the segfault. interestingly, the workaround for
> getting this compiled was to turn the body of the macro into a statement expression
> but otherwise it's the same code inside.
>
> 11. excessive inlining and stack usage
>
> while apparently gcc and clang make different inlining decisions, they're both
> bad at reusing the stack for the local variables of the inlined functions and
> sometimes produce high stack usage. linux already has an explicit way to prevent
> such undesired inlining, i just had to annotate a few more functions (but it's
> not meant to be exhaustive, it's based on my own config only).
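
A sketch of the annotation from the first item 11, using the plain GCC/clang noinline attribute. The names are hypothetical and the 256-byte buffer is an arbitrary size chosen for illustration. Keeping the helper out of line means its buffer lives in its own frame, which is released on return, instead of being added once per inlined copy to the caller's frame.

```c
#include <stddef.h>

#define DEMO_NOINLINE __attribute__((noinline))

/* The scratch buffer exists only while this function is running. */
static DEMO_NOINLINE unsigned demo_checksum(const unsigned char *data,
                                            size_t len)
{
    unsigned char scratch[256];
    unsigned sum = 0;
    size_t i;
    size_t n = len < sizeof(scratch) ? len : sizeof(scratch);

    /* Transform into the scratch buffer, then sum the result. */
    for (i = 0; i < n; i++)
        scratch[i] = (unsigned char)(data[i] ^ 0x5a);
    for (i = 0; i < n; i++)
        sum += scratch[i];
    return sum;
}

/* Two calls reuse the same stack space one after the other. */
unsigned demo_process(const unsigned char *a, size_t alen,
                      const unsigned char *b, size_t blen)
{
    return demo_checksum(a, alen) + demo_checksum(b, blen);
}
```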
>
> 11. uninitialized variable handling
>
> this one was a fun one to debug (no :P). apparently the getdents code computes
> a structure offset by computing a pointer difference - where the pointer in
> question is uninitialized. gcc seemingly manages to produce the desired offset
> whereas clang produces a 0 for the uninitialized pointers and hence for their
> difference as well, resulting in getdents not returning any entries in this
> particular case. very funny when you enter a directory but cannot list its
> content, although initramfs scripts tend not to appreciate it :). fortunately
> clang --analyze warns about such problems but then it crashes on a few more
> constructs so it's not an entirely painless exercise to go through the whole
> tree looking for such uninitialized variable usage (i checked most things but
> drivers/ and the non-x86 arch subtrees).
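
The structure-offset problem in the second item 11 can be illustrated as follows. This is a sketch with a hypothetical struct and macro names, not the kernel's actual getdents code: the first macro derives the offset from a pointer value, the second derives it from the type alone with offsetof and so never reads a pointer.

```c
#include <stddef.h>

struct demo_dirent {
    unsigned long d_ino;
    unsigned long d_off;
    unsigned short d_reclen;
    char d_name[1];
};

/*
 * Fragile form: the result is only meaningful if `de` holds a valid
 * address; applied to an uninitialized pointer it is undefined.
 */
#define DEMO_NAME_OFFSET_PTR(de) ((int)((de)->d_name - (char *)(de)))

/* Robust form: a compile-time constant, no pointer involved. */
#define DEMO_NAME_OFFSET offsetof(struct demo_dirent, d_name)

/* Record length: header, name and 2 extra bytes, rounded up to sizeof(long). */
size_t demo_reclen(size_t namlen)
{
    size_t align = sizeof(long);

    return (DEMO_NAME_OFFSET + namlen + 2 + align - 1) & ~(align - 1);
}
```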
>
> 12. variable length arrays in crypto/netfilter/crc
>
> this is an already known issue (in that clang is not going to support this
> gcc extension), so the workaround/fix was to rewrite the linux code.
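
A sketch of one possible rewrite for item 12, assuming the construct in question is a variable-length array declared inside a struct (the message does not show the affected code). The runtime-sized member becomes a C99 flexible array member and the object is allocated on the heap. All names are hypothetical; kernel code would use its own allocator rather than malloc.

```c
#include <stdlib.h>
#include <string.h>

struct demo_desc {
    unsigned flags;
    unsigned char ctx[]; /* flexible array member, sized at allocation */
};

/* Allocate the header plus ctx_size bytes of zeroed context. */
struct demo_desc *demo_desc_alloc(size_t ctx_size)
{
    struct demo_desc *d = malloc(sizeof(*d) + ctx_size);

    if (d == NULL)
        return NULL;
    d->flags = 0;
    memset(d->ctx, 0, ctx_size);
    return d;
}
```

The caller owns the returned object and releases it with free().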
>
> 13. ignoring -fcall-saved-xxx
>
> it seems that clang for some reason ignores -fcall-saved-xxx and miscompiles some
> code relying on it (lib/hweight.c) so as a workaround i removed this optimization
> from linux but obviously clang should be fixed instead.
>
> beyond the above fixes here and there, there're some opportunities to make better
> use of clang specific features as well, so if anyone feels inclined... ;)
>
> 14. clang's address_space attribute extension
>
> this would probably allow simplifying all the x86 per-cpu accessors (ditto
> for userland btw).
>
> 15. fix analyzer crashes
>
> as i mentioned above, there're a few constructs that make the analyzer crash
> on the linux tree, it'd probably be easy to fix them for someone familiar with
> the internals. the easiest way to run the analyzer (and to reproduce the problems)
> is to issue make CC=.../clang C=2 CHECK="clang --analyze" .
>
> 16. fix issues found by clang --analyze
>
> this is a bigger undertaking as the false positive ratio is quite low in my
> experience and there're many issues it finds (mostly unused variables or useless
> variable writes that sometimes can point at deeper issues such as not doing
> anything with error return values but i also saw potential NULL derefs).
>
> 17. extend the analyzer to understand the sparse defines
>
> sparse is a standalone static analyzer built for linux and several important
> subsystems have already been properly marked up for sparse analysis so it'd be
> nice if clang could make use of this information (in fact, some analysis could
> probably be done at normal compile time already since the checks are cheap).
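
For readers unfamiliar with the sparse markup in item 17, a sketch of how such annotations are typically wired up: the macros expand to attributes only when the checker defines __CHECKER__, and to nothing in a normal compile. The DEMO_* macros and demo_copy_from_user are hypothetical stand-ins modelled on the kernel's __user and __force, and the copy itself is a plain memcpy for illustration.

```c
#include <stddef.h>
#include <string.h>

#ifdef __CHECKER__
/* Seen only by the checker: a separate, non-dereferenceable address space. */
#define DEMO_USER  __attribute__((noderef, address_space(1)))
#define DEMO_FORCE __attribute__((force))
#else
/* A normal compiler sees ordinary pointers. */
#define DEMO_USER
#define DEMO_FORCE
#endif

/* Returns the number of bytes that could not be copied (always 0 here). */
size_t demo_copy_from_user(void *dst, const void DEMO_USER *src, size_t len)
{
    /* The explicit cast is the one place where the annotation is dropped. */
    memcpy(dst, (const void DEMO_FORCE *)src, len);
    return 0;
}
```

A compiler that understood these attributes natively could flag a direct dereference of a DEMO_USER pointer during a normal build.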
>
>
> cheers,
> PaX Team
>
> ---- File information -----------
> File: pax-linux-2.6.35.7-test25-clang-only.patch
> Date: 25 Oct 2010, 22:17
> Size: 29747 bytes.
> Type: Unknown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pax-linux-2.6.35.7-test25-clang-only.patch
Type: application/octet-stream
Size: 29747 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101026/932e0e75/attachment.obj>