[cfe-dev] linux/i386 and mregparm

pageexec at freemail.hu pageexec at freemail.hu
Mon Feb 28 12:51:42 PST 2011


Hi folks,

continuing the effort of getting clang to compile the linux kernel, i recently
got it to work on i386 as well. in the following i'll describe some issues i
ran into.

1. patching linux

the first attached patch is against linux 2.6.36.4 (well, it's against the
PaX patch but should mostly apply to vanilla as well). it should also work
on amd64 but i didn't try it this time. compared to a few months ago, there
are some changes needed on both sides again.

on the linux side, clang (albeit inadvertently) caught a bug where a function
declaration had a different section attribute than its definition. apparently
clang takes the former into account in the end while gcc takes the latter (and
neither warned about the fact ;). other changes in linux were needed due to clang
bugs and features that i'll elaborate on in the next sections.
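
as a rough illustration of the section attribute issue (the function and section
names below are made up for the example, they are not the ones from the patch),
the mismatched pattern is shown in the comment and the compiled code is a sketch
of one way to keep the two in agreement. the attribute is only applied on ELF
targets, which is an assumption of this sketch:

```c
/* hypothetical names for illustration, not the function from the patch */
#if defined(__ELF__)
#define DEMO_INIT_SECTION __attribute__((section(".text.demo_init")))
#else
#define DEMO_INIT_SECTION /* section naming differs on non-ELF targets */
#endif

/*
 * the problematic pattern had two different section names, roughly:
 *
 *     int demo_setup(int x) __attribute__((section(".text.a")));    (header)
 *     __attribute__((section(".text.b"))) int demo_setup(int x) { ... }
 *
 * using one macro for both keeps declaration and definition in agreement.
 */
int demo_setup(int x) DEMO_INIT_SECTION;

DEMO_INIT_SECTION int demo_setup(int x)
{
	return x + 1;
}
```

newer compilers may diagnose or even reject the mismatched form, which is why it
is only shown in the comment here.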

2. patching clang

the second attached patch is against r126479 and is needed to fix a bug that
was introduced, or more precisely, unearthed by the recent -mregparm support
patch. in particular, before -mregparm was supported, the only way to change
a function's parameter passing convention was to use a function attribute, so
in the absence of such an attribute the default regparm=0 was used.

consequently, specifying an explicit regparm=0 attribute had no real effect
and clang had no special treatment for this case. now with -mregparm the default
regparm value can be changed for a given compilation unit, so the old
assumption that the default regparm value is 0 no longer holds. this becomes
a problem when one uses a non-0 mregparm and wants to override it to 0 for some
functions.

just such a case arises with linux (due to another clang/llvm bug/feature, see
below) so this needed fixing to get a proper kernel. i hope i got the implementation
right, and would appreciate help with writing a test case as i'm not familiar
with the test system.
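
to make the override case concrete, here is a minimal sketch (the function and
macro names are hypothetical). it assumes the file is built for i386 with
-mregparm=3. on other targets the attribute is compiled out since regparm is
only meaningful on 32 bit x86:

```c
/* regparm only means something on i386, so it is guarded here */
#if defined(__i386__)
#define DEMO_REGPARM0 __attribute__((regparm(0)))
#else
#define DEMO_REGPARM0
#endif

/*
 * built with -mregparm=3, an undecorated function takes its first three
 * integer arguments in eax, edx and ecx ...
 */
int demo_default_conv(int a, int b, int c)
{
	return a + b + c;
}

/*
 * ... while this one asks for regparm=0 explicitly and must keep taking its
 * arguments on the stack, no matter what the command line default is.
 */
DEMO_REGPARM0 int demo_stack_conv(int a, int b, int c)
{
	return a + b + c;
}
```

both functions compute the same result. the difference is only visible in the
generated code, where the second one must not receive any argument in a
register even though the unit default is 3.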

3. builtin functions and regparm interaction

the above mentioned bug/feature is documented in bug #3997 already. basically
the issue is that when llvm emits builtin function calls, it has to use some
calling convention and arguably the desired regparm value may not be available
or even well defined at that stage. to this i'd like to add two observations:

 - with mregparm support it'd be possible to pass this value down to the llvm
   layer and take it into account when generating said function calls,

 - gcc does this already, although it gives a warning in hosted mode but not
   in freestanding mode

since for now builtins are always emitted with regparm=0, normal callers must
follow the same convention, so in linux these functions need an explicit attribute
to override the mregparm=3 used for everything else in the kernel. this is manageable
but was somewhat annoying to debug one by one ;) and also i don't think it'd be
accepted upstream.

4. unimplemented gcc command line switches

there're some switches that are used by the linux Makefiles but not yet implemented
in clang. beyond the noise (unless suppressed with -Qunused-arguments) there's actually
a problem caused by one of them: -fno-optimize-sibling-calls. it appears that this
optimization cannot be controlled in llvm and as a result some assumptions that the
kernel makes about the call chain depth at certain points will be invalid (some
tracing/logging code wants to look up the callers 2-3 levels up and due to this
optimization it can end up dereferencing userland frame pointer values).
for now i worked around this at the few places i ran into so far, but this should be
fixed in llvm and clang i think.
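
a small sketch of why sibling call optimization changes what such code sees
(hypothetical function names, and it relies on the gcc/clang builtin
__builtin_return_address and the noinline attribute):

```c
/* reports who called it, the way tracing code might */
__attribute__((noinline)) void *demo_who_called_me(void)
{
	return __builtin_return_address(0);
}

/*
 * the call below is in tail position. with sibling call optimization the
 * compiler may turn it into a plain jump, and then the address reported is
 * inside demo_outer()'s caller rather than inside demo_outer() itself, i.e.
 * one level of the call chain has silently disappeared.
 */
__attribute__((noinline)) void *demo_outer(void)
{
	return demo_who_called_me();
}
```

code that walks a fixed number of frames up from such a point will then land
one frame further out than intended, which is the kind of breakage described
above.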

5. -Wformat false positives

there's some recent change here which makes clang complain about a lot of format
strings that are valid in linux (being a freestanding environment and sporting its
own format string parser, not to mention extensions). i don't know what the right
solution here would be, but this causes lots of noise during compilation
(and turning the warning off may miss real problems).

6. weak functions and optimization

there appears to be yet another bug: when a weak function with an empty body
is encountered in a compilation unit, the optimizer assumes that that's all there is
to this function and omits passing arguments to calls to the weak function (and
presumably non-empty bodies would trigger other kinds of optimizations not necessarily
valid for the overrides).

obviously this is incorrect since the whole point of a weak function is that it
can be overridden in another compilation unit and therefore no assumptions can be
made about it in the optimizer. this particular problem arises in linux in a few
places as weak functions are sometimes used to implement arch specific overrides.
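
the pattern in question looks roughly like this (a sketch with made up names,
not code from the kernel):

```c
/* generic code: the weak default does nothing */
__attribute__((weak)) void demo_arch_fixup(int *value)
{
}

/*
 * an arch file in another compilation unit could provide a strong
 * definition such as
 *
 *     void demo_arch_fixup(int *value) { *value += 1; }
 *
 * and the linker would pick that one. so the caller below must still pass
 * the argument and must reload v after the call, even though the body that
 * is visible here is empty.
 */
int demo_apply_fixup(int v)
{
	demo_arch_fixup(&v);
	return v;
}
```

with only the weak default linked in, demo_apply_fixup() returns its argument
unchanged. with the override linked in it would return one more, provided the
call was emitted correctly.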

7. bounds checking false positives

this came up in the signal handling code, in particular there's a _NSIG_WORDS define
that's used like this:

   switch (_NSIG_WORDS) {
   case 4: /*...*/
   case 2: /*...*/
   case 1: /*...*/
   }

and the different cases index into an array. now the problem is that _NSIG_WORDS
is 2 for i386 but clang still evaluates case 4 and warns about the out-of-bounds
array accesses in there. a similar false positive arises in expressions like this:

  sizeof(long) == 8 ? /*...*/ : /*...*/ ;

where the code meant for the 64 bit archs gets evaluated even on 32 bit archs
and usually gives some warning, depending on the exact statements used. it'd be
nice to fix this somehow.
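
a self-contained sketch of the first pattern, modelled loosely on the signal set
helpers (the names and the struct are made up for the example, and the word
count is fixed at 2 as on i386):

```c
#define DEMO_NSIG_WORDS 2

typedef struct {
	unsigned long sig[DEMO_NSIG_WORDS];
} demo_sigset_t;

/* returns 1 if no bit is set in the signal set, 0 otherwise */
int demo_sigisempty(const demo_sigset_t *set)
{
	switch (DEMO_NSIG_WORDS) {
	case 4:
		/* dead code here: never executed when DEMO_NSIG_WORDS is 2,
		 * but sig[3] and sig[2] are what the warning was about */
		return (set->sig[3] | set->sig[2] |
			set->sig[1] | set->sig[0]) == 0;
	case 2:
		return (set->sig[1] | set->sig[0]) == 0;
	case 1:
		return set->sig[0] == 0;
	default:
		return 0;
	}
}
```

only the case 2 branch is reachable here, so the out-of-bounds indices in case 4
can never be evaluated at run time.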

8. rip relative addressing in mcmodel=kernel

this is amd64 related but i thought i'd mention it here. the third attached patch
allows llvm to generate rip relative accesses for kernel mode code as well (this
is what gcc does too), and this in turn reduces the size of a relocatable kernel.

this may be a linux specific feature, basically as its name says, this allows the
kernel image to be loaded at a suitably aligned but otherwise arbitrary address in
memory where the kernel will relocate itself (some post-link processing collects
and creates a special section with relocation info, its size can be reduced with
rip relative addressing).

about the commented out chunk in X86::isOffsetSuitableForCodeModel, i'm not
sure if such checking makes sense for kernel mode, so as a quick hack i just got
rid of it, but i don't know what the right solution there would be.

also, i was too lazy to separate it out: the llvm Makefile patch simply makes the svn
update process more verbose so that one can actually see which module is being
updated, feel free to ignore it but i found it useful for myself ;).

9. integrated-as support and linux

last but not least, it'd be nice one day to allow the use of integrated-as with
linux as well, but that requires implementing support for a few directives, i
recall pushsection/popsection at least but there was also some issue with the
assembler not detecting the proper size for bit operations (even though it could
have deduced them from the arguments like it does for some insns already).
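
for reference, the kind of inline asm that depends on these directives looks
roughly like this. it is a sketch only: the section name and the recorded byte
are made up, and the asm is limited to ELF targets on the assumption that the
assembler in use understands the GNU .pushsection/.popsection directives:

```c
int demo_tagged(int x)
{
#if defined(__ELF__)
	/*
	 * temporarily switch to a side section, drop one byte of metadata
	 * there, then return to whatever section the function lives in
	 */
	__asm__ volatile(
		".pushsection .demo_notes,\"a\"\n\t"
		".byte 1\n\t"
		".popsection\n\t");
#endif
	return x + 1;
}
```

the function body itself is unaffected. the directives only control where the
extra byte ends up in the object file, so an assembler that doesn't know them
cannot build such code at all.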

so that's all in a nutshell, i'm now wondering if someone could tell me
  - if the mregparm fix is acceptable and can make it into 2.9
  - which problems should get a bugzilla entry (i.e., they'll be fixed one day)
  - what to do with the issues that are perhaps considered as features ;)

cheers,

 PaX Team

-------------- next part --------------
The following section of this message contains a file attachment
prepared for transmission using the Internet MIME message format.
If you are using Pegasus Mail, or any other MIME-compliant system,
you should be able to save it or view it from within your mailer.
If you cannot, please ask your system administrator for assistance.

   ---- File information -----------
     File:  pax-linux-2.6.36.4-test21-clang-only.patch
     Date:  25 Feb 2011, 23:03
     Size:  49819 bytes.
     Type:  Unknown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pax-linux-2.6.36.4-test21-clang-only.patch
Type: application/octet-stream
Size: 49819 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20110228/366ed2f9/attachment.obj>
-------------- next part --------------
   ---- File information -----------
     File:  clang-mregparm-fix.patch
     Date:  28 Feb 2011, 11:03
     Size:  11293 bytes.
     Type:  Unknown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-mregparm-fix.patch
Type: application/octet-stream
Size: 11293 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20110228/366ed2f9/attachment-0001.obj>
-------------- next part --------------
   ---- File information -----------
     File:  llvm-amd64-kernel-pic.patch
     Date:  28 Feb 2011, 22:44
     Size:  5728 bytes.
     Type:  Unknown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-amd64-kernel-pic.patch
Type: application/octet-stream
Size: 5728 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20110228/366ed2f9/attachment-0002.obj>
