[cfe-commits] implicit sign/bitwidth conversions during array indexing?

Ted Kremenek kremenek at apple.com
Thu Nov 13 19:41:54 PST 2008





On Nov 13, 2008, at 6:43 PM, Zhongxing Xu <xuzhongxing at gmail.com> wrote:

>
>
> On Fri, Nov 14, 2008 at 10:31 AM, Ted Kremenek <kremenek at apple.com>  
> wrote:
>
> On Nov 13, 2008, at 6:03 PM, Zhongxing Xu wrote:
>
>> The standard does not specify any information about this  
>> conversion. The compiler interprets E1[E2] as *(E1+E2).
>
> That's right.  That's what the C standard says too (section 6.5.2.1).
>
>> The sign does not affect the way that the machine does 'add' (or  
>> 'sub'). (The sign only affects some operations, c.f. LLVM  
>> instructions)
>
>
> I'm skeptical that the choice of signedness or bitwidth is
> arbitrary when handling E2, but I could be wrong.  Since the
> standard says that E1[E2] is the same as *(E1 + E2), we probably
> need to perform any implicit type conversions that Sema would do
> if the expression were literally written that way.  The other
> option is to look at what the compiler does (llvm-gcc, for example).
>
> For example:
>
> int f(int *p) {
>   short i = 0;
>   unsigned short i_u = 0;
>   int j = 0;
>   unsigned j_u = 0;
>   long long k = 0;
>
>   int x;
>   x = *(p + i);
>   x += *(p + j);
>   x += *(p + i_u);
>   x += *(p + j_u);
>   x += *(p + k);
>
>   return x;
> }
>
> The -ast-dump (without the DeclStmts) is:
>
>   (BinaryOperator 0x21088f0 <line:9:3, col:14> 'int' '='
>     (DeclRefExpr 0x2108810 <col:3> 'int' Var='x' 0x21087c0)
>     (UnaryOperator 0x21088d0 <col:7, col:14> 'int' prefix '*'
>       (ParenExpr 0x21088b0 <col:8, col:14> 'int *'
>         (BinaryOperator 0x2108890 <col:9, col:13> 'int *' '+'
>           (DeclRefExpr 0x2108830 <col:9> 'int *' ParmVar='p' 0x21084a0)
>           (ImplicitCastExpr 0x2108870 <col:13> 'int'
>             (DeclRefExpr 0x2108850 <col:13> 'short' Var='i' 0x21042e0))))))
>   (CompoundAssignOperator 0x21089d0 <line:10:3, col:15> 'int' '+=' ComputeTy='int'
>     (DeclRefExpr 0x2108910 <col:3> 'int' Var='x' 0x21087c0)
>     (UnaryOperator 0x21089b0 <col:8, col:15> 'int' prefix '*'
>       (ParenExpr 0x2108990 <col:9, col:15> 'int *'
>         (BinaryOperator 0x2108970 <col:10, col:14> 'int *' '+'
>           (DeclRefExpr 0x2108930 <col:10> 'int *' ParmVar='p' 0x21084a0)
>           (DeclRefExpr 0x2108950 <col:14> 'int' Var='j' 0x2108630)))))
>   (CompoundAssignOperator 0x2108ad0 <line:11:3, col:17> 'int' '+=' ComputeTy='int'
>     (DeclRefExpr 0x21089f0 <col:3> 'int' Var='x' 0x21087c0)
>     (UnaryOperator 0x2108ab0 <col:8, col:17> 'int' prefix '*'
>       (ParenExpr 0x2108a90 <col:9, col:17> 'int *'
>         (BinaryOperator 0x2108a70 <col:10, col:14> 'int *' '+'
>           (DeclRefExpr 0x2108a10 <col:10> 'int *' ParmVar='p' 0x21084a0)
>           (ImplicitCastExpr 0x2108a50 <col:14> 'int'
>             (DeclRefExpr 0x2108a30 <col:14> 'unsigned short' Var='i_u' 0x21085a0))))))
>   (CompoundAssignOperator 0x2108bb0 <line:12:3, col:17> 'int' '+=' ComputeTy='int'
>     (DeclRefExpr 0x2108af0 <col:3> 'int' Var='x' 0x21087c0)
>     (UnaryOperator 0x2108b90 <col:8, col:17> 'int' prefix '*'
>       (ParenExpr 0x2108b70 <col:9, col:17> 'int *'
>         (BinaryOperator 0x2108b50 <col:10, col:14> 'int *' '+'
>           (DeclRefExpr 0x2108b10 <col:10> 'int *' ParmVar='p' 0x21084a0)
>           (DeclRefExpr 0x2108b30 <col:14> 'unsigned int' Var='j_u' 0x21086a0)))))
>   (CompoundAssignOperator 0x2108c90 <line:13:3, col:15> 'int' '+=' ComputeTy='int'
>     (DeclRefExpr 0x2108bd0 <col:3> 'int' Var='x' 0x21087c0)
>     (UnaryOperator 0x2108c70 <col:8, col:15> 'int' prefix '*'
>       (ParenExpr 0x2108c50 <col:9, col:15> 'int *'
>         (BinaryOperator 0x2108c30 <col:10, col:14> 'int *' '+'
>           (DeclRefExpr 0x2108bf0 <col:10> 'int *' ParmVar='p' 0x21084a0)
>           (DeclRefExpr 0x2108c10 <col:14> 'long long' Var='k' 0x2108730)))))
>
> It appears that a promotion and sign change is done for 'short' and
> 'unsigned short' to int, but there are no conversions otherwise.  Is
> this correct?  Surely the compiler does some kind of
> promotion/truncation when doing pointer arithmetic.
>
> Is this the rule:
>  - if the bitwidth of E2 is the same as the pointer's, do the
> arithmetic.
>  - if the bitwidth of E2 differs from the pointer's, truncate or
> extend it to the width of the pointer (signedness determines whether
> the extension is a sign or a zero extension), then do the arithmetic.

I'm not certain.  Note that the 'long long' value 'k' was not
truncated.  Is this a Sema bug, or is this the correct behavior?  For
this target LongLongWidth is 64, the bit width for 'int' is 32, and
the bit width for a pointer is (I believe) 32 as well.
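
To make the question concrete, here is a minimal sketch in plain C of what
the proposed rule would mean for the index operand.  None of this is Clang
code: the helper names are made up for illustration, and truncate_to_32
only models a 32-bit pointer width, it does not claim to show what Sema or
CodeGen actually does.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of this is Clang code. */

/* Widening a signed 16-bit index to pointer width sign-extends it. */
static intptr_t widen_signed16(int16_t i) { return (intptr_t)i; }

/* Widening an unsigned 16-bit index to pointer width zero-extends it. */
static intptr_t widen_unsigned16(uint16_t i) { return (intptr_t)i; }

/* Models narrowing a 64-bit index to an assumed 32-bit pointer width:
   the conversion to uint32_t keeps only the low 32 bits. */
static uint32_t truncate_to_32(long long k) { return (uint32_t)k; }

/* E1[E2] is defined as *(E1 + E2), so these two read the same element. */
static int load_subscript(const int *p, int16_t i) { return p[i]; }
static int load_deref(const int *p, int16_t i) { return *(p + i); }
```

The same 16-bit pattern 0xFFFF widens to -1 through widen_signed16 but to
65535 through widen_unsigned16, which is why signedness cannot be ignored
when the index is narrower than the pointer.  In the other direction, a
'long long' index such as 0x100000001 would collapse to 1 under the
modeled 32-bit truncation.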