[LLVMdev] inline asm semantics: output constraint width smaller than input

Ingo Molnar mingo at elte.hu
Sat Jan 24 09:27:58 PST 2009


* Török Edwin <edwintorok at gmail.com> wrote:

> On 2009-01-23 20:27, Török Edwin wrote:
> >> I'd not mind it at all if the kernel could be built with other open-source 
> >> compilers too.
> >>
> >> Now in this case the patch you suggest might end up hurting the end result 
> >> so it's not an unconditional 'yes'. But ... how much it actually matters 
> >> depends on the circumstances.
> >>
> >> So could you please send a sample patch for some of most common inline 
> >> assembly statements that are affected by this, so that we can see:
> >>
> >>    1) how ugly the LLVM workarounds are
> >
> > Ok, I will prepare a patch for both cases.
> >
> >   
> >>    2) how they affect the generated kernel image in practice
> >>
> >> My gut feeling is that it's going to be acceptable with a bit of thinking 
> >> (we might even do some wrappers to do this cleanly) - but I'd really like 
> >> to see it before giving you that judgement.
> 
> The patch below is needed to build the kernel for x86_64, with the
> attached .config, using llvm-gcc (trunk, with the patch from
> http://llvm.org/bugs/show_bug.cgi?id=2989#c2).
> 
> The .config has KVM turned off, because I didn't know how to change
> x86_emulate.c so that LLVM builds it
> (http://llvm.org/bugs/show_bug.cgi?id=3373#c10).
> For 32-bit, some more changes are required.
> 
> The resulting kernel images are the same size:
> $ ls -l vmlinux.patched
> -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 17:58 vmlinux.patched
> $ ls -l vmlinux
> -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 18:01 vmlinux
> 
> They aren't identical, though: a disassembly shows that the addresses
> of most functions changed, and some register assignments changed
> (r14 instead of r15, and so on).
> 
> Are these changes correct, and are they acceptable?
> 
> Best regards,
> --Edwin
> 
> ---
>  arch/x86/include/asm/uaccess.h |   10 ++++++----
>  arch/x86/lib/delay.c           |    2 +-
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index 69d2757..28280de 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -154,7 +154,7 @@ extern int __get_user_bad(void);
>  
>  #define get_user(x, ptr)                        \
>  ({                                    \
> -    int __ret_gu;                            \
> +    unsigned long __ret_gu;                        \
>      unsigned long __val_gu;                        \
>      __chk_user_ptr(ptr);                        \
>      might_fault();                            \
> @@ -176,7 +176,7 @@ extern int __get_user_bad(void);
>          break;                            \
>      }                                \
>      (x) = (__typeof__(*(ptr)))__val_gu;                \
> -    __ret_gu;                            \
> +    (int)__ret_gu;                            \
>  })
>  
>  #define __put_user_x(size, x, ptr, __ret_pu)            \
> @@ -239,11 +239,13 @@ extern void __put_user_8(void);
>   */
>  #define put_user(x, ptr)                    \
>  ({                                \
> -    int __ret_pu;                        \
> +    __typeof__(*(ptr)) __ret_pu;                \

This does not look right. We can sometimes have put_user() of non-integer 
types (say, structures). How does the (int)__ret_pu cast work in that 
case? For such types we fall into this branch:

        default:                                                \
                __put_user_x(X, __pu_val, ptr, __ret_pu);       \
                break;                                          \

and __ret_pu ends up with a nonsensical type.
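
To make that concrete, here is a toy reduction (hypothetical names, not 
the actual kernel macro; the asm helper is replaced by a plain 
assignment) of why the final (int) cast cannot work for aggregate types:

    struct pair {
            unsigned long a, b;
    };

    #define toy_put_user(x, ptr)                                    \
    ({                                                              \
            __typeof__(*(ptr)) __ret_pu;                            \
            __ret_pu = (x);     /* stand-in for __put_user_x() */   \
            (int)__ret_pu;                                          \
    })

    void example(struct pair *uptr, struct pair val, int *iptr)
    {
            toy_put_user(42, iptr);   /* fine: __ret_pu is an int */
            /*
             * toy_put_user(val, uptr) would not even compile:
             * the (int) cast would be applied to a struct.
             */
            (void)uptr; (void)val;
    }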

>      __typeof__(*(ptr)) __pu_val;                \
>      __chk_user_ptr(ptr);                    \
>      might_fault();                        \
>      __pu_val = x;                        \
> +       /* return value is 0 or -EFAULT, both fit in 1 byte, and \
> +    * are sign-extendable to int */                \
>      switch (sizeof(*(ptr))) {                \
>      case 1:                            \
>          __put_user_x(1, __pu_val, ptr, __ret_pu);    \
> @@ -261,7 +263,7 @@ extern void __put_user_8(void);
>          __put_user_x(X, __pu_val, ptr, __ret_pu);    \
>          break;                        \
>      }                            \
> -    __ret_pu;                        \
> +    (int)__ret_pu;                        \
>  })
>  
>  #define __put_user_size(x, ptr, size, retval, errret)            \
> diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
> index f456860..12d27f8 100644
> --- a/arch/x86/lib/delay.c
> +++ b/arch/x86/lib/delay.c
> @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay);
>  
>  inline void __const_udelay(unsigned long xloops)
>  {
> -    int d0;
> +    unsigned long d0;
>  
>      xloops *= 4;
>      asm("mull %%edx"

Is this all that you need (plus the 16-bit setup code tweaks) to get LLVM 
to successfully build a 64-bit kernel image?

If yes, then this doesn't look all that bad or invasive at first sight 
(provided the put_user() workaround can be expressed in a cleaner way), 
but in any case it would be nice to hear an LLVM person's opinion on 
roughly when this is going to be solved in LLVM itself.
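
For reference, here is a minimal, self-contained sketch (hypothetical 
x86-64 code, not the kernel's) of the constraint-width mismatch the 
subject line refers to: an output operand narrower than the input tied 
to it via a matching constraint. Declaring 'lo' below as a 32-bit int 
would reproduce the problem; widening it to unsigned long is the same 
workaround the delay.c hunk above uses for d0:

    #include <stdio.h>

    static unsigned long mul_high32(unsigned long a, unsigned int b)
    {
            /* With "unsigned long lo" the output and the "1"-tied
             * input have the same width; with "int lo" the output
             * would be narrower than the 64-bit input 'a'. */
            unsigned long hi, lo;

            /* mull multiplies %eax by the operand and leaves the
             * 64-bit product in %edx:%eax. */
            asm("mull %3"
                : "=&d" (hi), "=a" (lo)
                : "1" (a), "rm" (b));

            (void)lo;       /* low half not needed here */
            return hi;
    }

    int main(void)
    {
            /* 0xf0000000 * 16 = 0xf00000000, so the high half is 15 */
            printf("%lu\n", mul_high32(0xf0000000UL, 16));
            return 0;
    }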

	Ingo


