[cfe-dev] [StaticAnalysis] Determine dereference values

Rafael·Stahl via cfe-dev cfe-dev at lists.llvm.org
Fri Jul 28 07:43:02 PDT 2017


Thank you for the reply. I will check those out.

The FixedAddress checker is definitely a good start too, but we are more 
interested in the actual dereferences and in comparing the deduced SVals 
to known address ranges.
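
To make "comparing to known address ranges" concrete, here is a minimal
sketch of the lookup we have in mind once a concrete address has been
deduced. The AddressRange type, the kKnownRanges table, the region names
and the addresses are all made up for illustration; none of this is
clang API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical description of one memory-mapped region.
struct AddressRange {
    std::uint64_t begin;  // first byte of the region
    std::uint64_t end;    // one past the last byte of the region
    const char* name;
};

// Made-up table of regions we would like to match accesses against.
constexpr std::array<AddressRange, 2> kKnownRanges{{
    {0x1000, 0x2000, "peripheral A"},
    {0x4000, 0x5000, "peripheral B"},
}};

// Returns the region containing addr, or nullptr if no region matches.
const AddressRange* findRange(std::uint64_t addr) {
    for (const AddressRange& r : kKnownRanges) {
        assert(r.begin < r.end && "each range must be non-empty");
        if (addr >= r.begin && addr < r.end)
            return &r;
    }
    return nullptr;
}
```

With this table, findRange(0x1634) would return the "peripheral A" entry
and findRange(0x3000) would return nullptr.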

What I meant is that, for example, the analyzer is expected to report a 
null dereference bug in the following:

int main()
{
     int* p = (int*)sizeof(int);
     p -= 1;
     return *p;
}

I just finished debugging the issue and found that the implementation of 
pointer arithmetic in SimpleSValBuilder::evalBinOpLN was missing some 
logic. The "Multiplicand" was never set to the pointee size and was 
therefore always zero. The following fix works for me. Should I send this 
to the commits list, or can someone take a look at it here?

diff --git a/SimpleSValBuilder.cpp b/SimpleSValBuilder_fix.cpp
index f09f969..da31fc0 100644
--- a/SimpleSValBuilder.cpp
+++ b/SimpleSValBuilder_fix.cpp
@@ -927,6 +927,8 @@ SVal SimpleSValBuilder::evalBinOpLN(ProgramStateRef state,

        // Offset the increment by the pointer size.
        llvm::APSInt Multiplicand(rightI.getBitWidth(), /* isUnsigned */ true);
+      QualType PteeTy = resultTy.getTypePtr()->castAs<PointerType>()->getPointeeType();
+      Multiplicand = getContext().getTypeSizeInChars(PteeTy).getQuantity();
        rightI *= Multiplicand;
        rightI *= Multiplicand;

        // Compute the adjusted pointer.
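
To illustrate why the scaling matters for the example above, here is a
minimal sketch of the arithmetic the analyzer has to model. The
offsetPointer helper is hypothetical and not clang code, and the numbers
assume sizeof(int) == 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `base + index` on a typed pointer: the index is
// scaled by the size of the pointee type before it is added to the
// byte address.
std::int64_t offsetPointer(std::int64_t base, std::int64_t index,
                           std::int64_t pointeeSize) {
    assert(pointeeSize > 0 && "pointee type must have a non-zero size");
    return base + index * pointeeSize;
}
```

For the example, offsetPointer(4, -1, 4) gives 0, which is the null
dereference we expect to be reported. If the multiplicand is zero
instead, the scaled offset is zero, so the modelled address would stay
at 4 and the dereference would not be flagged.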


On 28/07/17 16:08, Artem Dergachev wrote:
> First of all, there's already an experimental alpha.core.FixedAddress 
> checker that seems to be doing what you want, in a relatively easy 
> manner. I cannot guarantee that it actually works well, but it did 
> seem to work somehow last time I looked at it.
>
> Regarding execution path coverage:
>
> - In general, yeah, the engine does not guarantee it wouldn't give up 
> when it sees a lot of loops or other execution path splits. Analysis 
> across translation units is being worked on in 
> https://reviews.llvm.org/D30691 but at the cost of giving up even 
> more often, so it would reveal cross-module issues at the cost of 
> randomly hiding intra-module issues that would otherwise be seen (so 
> if you're all about coverage, you'd probably want to use both 
> analyses in two runs).
>
> - If you expect many loops with relatively small fixed numbers of 
> iterations, have a look at the ongoing GSoC project which, in 
> particular, introduces an option to unroll such loops completely even 
> if otherwise the analyzer would have given up 
> (https://reviews.llvm.org/D34260 is already available in master; see 
> other patches by Peter as well); other tweaks to give up in a less 
> fatal manner when the loop is infinite ("loop widening") are also 
> planned.
>
> - There are many existing options to control the "giving up" behavior 
> under -analyzer-config. You may have to read AnalyzerOptions.cpp to 
> learn them, because unfortunately they aren't very well documented. 
> They control loops and inter-procedural analysis.
>
> > I noticed the engine does not take the value of a file scoped 
> > constant pointer "T* const" into account.
>
> Reproduced. Unimplemented, I guess, but shouldn't be hard.
>
> > It seems to me that symbolic values of Locs are not fully tracked. 
> > Is this true and is there a way to fully track them?
>
> I don't think I fully understand, could you give an example?
>
>
> On 7/27/17 6:51 PM, Rafael·Stahl via cfe-dev wrote:
>> Hello
>>
>> We are looking into using the clang front-end for static analysis.
>>
>> The goal is to find memory accesses on the source code level whose 
>> addresses can be statically determined or constrained. This should 
>> work across functions and even translation units.
>>
>> Example:
>> main.c:
>>     int main() {
>>       for (int i = 0; i < 4; i++)
>>         access(((int*)0x1234) + i);  // pass 0x1234, 0x1238, 0x123c, 0x1240
>>       access(*(int**)0x4444);  // pass statically unknown value
>>     }
>> other.c:
>>     void access(int* p) {
>>       // Want output: read at addr (0x1634|0x1638|0x163c|0x1640|unknown) from clang::Expr*.
>>       ((volatile int*)p)[0x100];
>>     }
>>
>> The clang StaticAnalysis library does a lot of the work we are 
>> interested in. That is, determining what values an expression is 
>> constrained to, while understanding stores, loads and running a 
>> symbolic execution engine.
>>
>> How scalable is this approach? Even though we would require inter-TU 
>> analysis, the problem could be reduced by only looking at accesses 
>> that have the volatile qualifier since we are looking at hardware 
>> accesses of a bare-metal program. Some retries without inlining are 
>> fine, because we assume the accesses are not separated from the 
>> constant by code of significant complexity.
>>
>> Will this be decently reliable? We are interested in cases where a 
>> constant is dragged across a couple of low bounded loops with a bit 
>> of arithmetic. What are typical cases where the engine gives up 
>> because of exploding complexity? I have found that loops are explored 
>> in a very limited scope. Is there an easy way to relax these limits a 
>> bit at the cost of much higher execution time?
>>
>> I noticed the engine does not take the value of a file scoped 
>> constant pointer "T* const" into account. Is there a technical 
>> limitation that prevents doing this?
>>
>> I also tried to hack a bit on the DereferenceChecker and 
>> DivZeroChecker to try and get the symbolic or even concrete value of 
>> a Loc, but only got the initialized value and not the value it should 
>> be at the dereference. When plotting a graph from a source that does 
>> basic arithmetic on a pointer, the expression value never changes. It 
>> seems to me that symbolic values of Locs are not fully tracked. Is 
>> this true and is there a way to fully track them?
>>
>> A backwards data-flow analysis on IR level is probably a more 
>> reasonable approach in general, but getting the exact clang::Expr 
>> that does the access is valuable to us.
>>
>> Overall, is this problem reasonably solvable with clang static 
>> analysis? Any feedback is greatly appreciated!
>>
>> Best Regards
>> Rafael
>>
>
