[cfe-dev] Static Analyzer and signed 64bit constraints

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Tue Jan 30 10:02:32 PST 2018


Hmm, this looks correct. Since typeof(sizeof(...)) is size_t, which is 
uint64_t on that architecture, and C promotion rules for comparison of 
int64_t and uint64_t would require conversion of both operands into 
uint64_t, all negative values of 'z' would turn out to be larger than 
20. This is different from comparison of, say, int16_t and uint16_t, 
which would be converted to int for the purposes of comparison.

Like, the analyzer doesn't always do a good job around integral casts, 
but this case looks correct.

On 29/01/2018 9:28 PM, - yrp wrote:
>
> Hi Artem,
>
> Thanks for the reply and info. After minimizing it appears this is related to sizeof + 64bit perhaps? I had to modify your testcase slightly to demonstrate the behavior Im seeing:
>
>
> #include <stdlib.h>
> #include <stdio.h>
>
> void foo(void) {
>    int64_t z = -1;
>    read(1, &z, sizeof(z));
>
>    char x[20] = { 0 };
>    char y[20] = { 0 };
>
>    // if (z > 20)
>    if (z > sizeof x)
>      exit(-1);
>    memcpy(x, y, z);
>    // triggers state dump via ExprInspection
>    clang_analyzer_printState();
>    // don't garbage-collect 'z' until now
>    z;
> }
>
>
> $ clang --analyze test.c -Xclang -analyzer-checker -Xclang debug.ExprInspection
> ...
>   conj_$0{int64_t} : { [1, 20] }
>
> If I change the type of 'z' to int32_t I get:
>
>   conj_$0{int32_t} : { [-2147483648, -1], [1, 20] }
>
> If I change the if guard on 'z' to the literal '20' and keep the type as int64_t I get the expected range:
>
>   conj_$0{int64_t} : { [-9223372036854775808, -1], [1, 20] }
>
>
> Have I misunderstood some C behavior regarding sizeof and this is expected? I'll feel pretty silly if this is the case...
>
> In case it's relevant: clang version 6.0.0 (trunk 318002) running on linux
>
> Cheers!
> yrp
>
>
>
>
> On Monday, January 29, 2018, 6:18:51 PM PST, Artem Dergachev <noqnoqneo at gmail.com> wrote:
>
>
>
>
>
>> The constraint on the above value is printed as 'conj_$0{int32_t} : {
> [-2147483648, -1], [1, 20] }'. What happened to zero? My best guess is
> the static analyzer realizing that 'memcpy(x, y, 0)' is a nop and
> pruning that program state?
>
> Yes, indeed, it is an effect of how we model 'memcpy'. I don't think
> it's the desired behavior though. In particular, we don't display a
> warning on 'memcpy(NULL, y, 0)', even though it's UB. So, yeah, sounds
> like either a minor bug. It's not obvious for me how often paths with
> zero-size memcpys are actually feasible in real-world programs.
>
>
>
>> However, if I switch this to an int64_t I get:
>> conj_$0{int64_t} : { [1, 20] }
> Hmm, weird, i'm not able to reproduce it. Could you provide more info?
> I'm trying:
>
>
> $ cat test.c
>
>
>    #include <stdlib.h>
>    #include <stdio.h>
>
>    void foo(void *x, void *y) {
>      int64_t z = -1;
>      read(1, &z, sizeof(z));
>      if (z > 20)
>        exit(-1);
>      memcpy(x, y, z);
>      // triggers state dump via ExprInspection
>      clang_analyzer_printState();
>      // don't garbage-collect 'z' until now
>      z;
>    }
>
>
> $ clang --analyze test.c -Xclang -analyzer-checker -Xclang
> debug.ExprInspection
>
>
> Store (direct and default bindings), 0x7fcf3114a5e8 :
>   (z,0,direct) : conj_$0{int64_t}
>   (GlobalInternalSpaceRegion,0,default) : conj_$1{int}
>   (GlobalSystemSpaceRegion,0,default) : conj_$2{int}
>
> Expressions:
>   (0x7fcf30579920,0x7fcf3113c0c0) clang_analyzer_printState :
> &code{clang_analyzer_printState}
>
> Ranges of symbol values:
>   conj_$0{int64_t} : { [-9223372036854775808, -1], [1, 20] }
>
>
>
> On 29/01/2018 12:25 AM, - yrp via cfe-dev wrote:
>> Hello,
>>
>> I'm attempting to write a checker to look for certain signedness
>> issues. I have code similar to the following:
>>
>> int32_t z = -1;
>> read (STDIN_FILENO, &z, sizeof(z));
>>
>> if (z > 20) exit (EXIT_FAILURE);
>>
>> memcpy (x, y, z);
>>
>> Inside the checker I am printing the constraints on the SVal 'z' with
>> a PreCall, filtering on memcpy:
>>
>> ProgramStateRef State = C.getState();
>> State->getConstraintManager().print(State, llvm::outs(), "\n", " ");
>>
>> I have two questions related to this:
>>
>> 1. The constraint on the above value is printed as 'conj_$0{int32_t} :
>> { [-2147483648, -1], [1, 20] }'. What happened to zero? My best guess
>> is the static analyzer realizing that 'memcpy(x, y, 0)' is a nop and
>> pruning that program state?
>>
>> 2. The above code, aside from the zero exception, makes sense when z
>> is an int8_t, int16_t, or an int32_t:
>>
>>   conj_$0{int8_t} : { [-128, -1], [1, 20] }
>>   conj_$0{int16_t} : { [-32768, -1], [1, 20] }
>>   conj_$0{int32_t} : { [-2147483648, -1], [1, 20] }
>>
>> However, if I switch this to an int64_t I get:
>>
>>   conj_$0{int64_t} : { [1, 20] }
>>
>> What happened to all the negative numbers?
>>
>> Apologies if I've missed something obvious.
>>
>> Cheers!
>> -yrp
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




More information about the cfe-dev mailing list