[LLVMdev] Is this a bug in clang?

Ahmed Charles ahmedcharles at gmail.com
Tue Apr 19 19:25:20 PDT 2011


This code is undefined, meaning that all bets are off; don't do it.
That is, it reads the value of i between two sequence points and uses
it for something other than determining the value written.

From: Csaba Raduly
Sent: Tuesday, April 19, 2011 3:44 AM
To: Joe Armstrong
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Is this a bug in clang?
Hi Joe

On Tue, Apr 19, 2011 at 10:59 AM, Joe Armstrong <joearms at gmail.com> wrote:
> Hello,
>
> Is this a bug in clang, or a bug in my thinking?
>
> /Joe Armstrong
>
>
>
> When I compile the following program I get different answers in clang and gcc.
>
> int printf(const char * format, ...);

Nitpick: you shouldn't do that. Now you have a potential mismatch
between what you compile and what you link to.

>
> int main()
> {
>  int i, j;
>  i = 10;
>  j = i++ + 20 + i;
>  printf("j = %d\n",j);
>  return(0);
> }

>
>  $ gcc bug2.c
>  $ ./a.out
>  j = 40
>  $ clang bug2.c
>  $ ./a.out
>  j = 41
>
>  I think the correct answer is 41. If my understanding of C is correct
>  (which, of course, it might not be) the incremented value of i++ is
>  first made available at the next sequence point, which is after the
>  ';' at the end of the offending statement.
>

There is no such guarantee. From the C99 standard,

5.1.2.3 Program execution:
} 2 ... At certain specified points in the execution sequence called
} sequence points, all side effects of previous evaluations shall be
} complete and no side effects of subsequent evaluations shall have
} taken place.

The side effect is made available _no_later_ than the sequence point.
There is no guarantee that it would not be made available earlier.

An example from the standard:

#include <stdio.h>
int sum;
char *p;
/* ... */
sum = sum * 10 - '0' + (*p++ = getchar());

the expression statement is grouped as if it were written as

sum = (((sum * 10) - '0') + ((*(p++)) = (getchar())));

but the actual increment of p can occur at any time between the
previous sequence point and the next sequence point (the ;), and the
call to getchar can occur at any point prior to the need of its
returned value.
/quote

Also,

6.5 Expressions
} 2 Between the previous and next sequence point an object shall have
}   its stored value modified at most once by the evaluation of an
}   expression. Furthermore, the prior value shall be accessed only to
}   determine the value to be stored.60)

Footnote 60:
} This paragraph renders undefined statement expressions such as
}  i = ++i + 1;
}  a[i++] = i;
} while allowing
}  i = i + 1;
}  a[i] = i;


If you are lucky, the behavior of your program is unspecified (each
implementation is required to behave consistently with itself, which
is why GCC and clang are allowed to produce different results).

If not, you invoked the dreaded undefined behavior.

Csaba
-- 
GCS a+ e++ d- C++ ULS$ L+$ !E- W++ P+++$ w++$ tv+ b++ DI D++ 5++
The Tao of math: The numbers you can count are not the real numbers.
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds
"People disagree with me. I just ignore them." -- Linus Torvalds
