[cfe-dev] [StaticAnalysis] Determine dereference values
Rafael Stahl via cfe-dev
cfe-dev at lists.llvm.org
Fri Jul 28 07:43:02 PDT 2017
Thank you for the reply. I will check those out.
The FixedAddress checker is definitely a good start too, but we are more
interested in the actual dereferences and in comparing the deduced SVals
against known address ranges.
What I meant is that, for example, the analyzer should report a null
dereference bug in the following:
int main()
{
    int* p = (int*)sizeof(int);
    p -= 1;
    return *p;
}
I just finished debugging the issue and found that the implementation of
pointer arithmetic in SimpleSValBuilder::evalBinOpLN was missing some
logic. The "Multiplicand" was never assigned a value and was therefore
always zero, so the offset was always zero too. The following fix worked
for me. Should I send this to the commits list, or can someone take a
look at it here?
diff --git a/SimpleSValBuilder.cpp b/SimpleSValBuilder_fix.cpp
index f09f969..da31fc0 100644
--- a/SimpleSValBuilder.cpp
+++ b/SimpleSValBuilder_fix.cpp
@@ -927,6 +927,8 @@ SVal SimpleSValBuilder::evalBinOpLN(ProgramStateRef state,
       // Offset the increment by the pointer size.
       llvm::APSInt Multiplicand(rightI.getBitWidth(), /* isUnsigned */ true);
+      QualType PteeTy = resultTy.getTypePtr()->castAs<PointerType>()->getPointeeType();
+      Multiplicand = getContext().getTypeSizeInChars(PteeTy).getQuantity();
       rightI *= Multiplicand;
       // Compute the adjusted pointer.
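To make the intended arithmetic concrete, here is a minimal standalone
sketch of what the offset computation should produce for concrete
addresses. It is not the analyzer code: adjustAddress and checkExamples
are hypothetical names made up for illustration, and plain 64-bit
integers stand in for the APSInt values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the offset computation in evalBinOpLN; it only
// models concrete addresses and is not the analyzer's actual code.
std::uint64_t adjustAddress(std::uint64_t base, std::uint64_t count,
                            std::uint64_t pointeeSize, bool isSubtraction) {
    // Scale the integer operand by the pointee size (the "Multiplicand").
    const std::uint64_t offset = count * pointeeSize;
    return isSubtraction ? base - offset : base + offset;
}

// Checks the cases discussed in this thread.
void checkExamples() {
    const std::uint64_t intSize = sizeof(int);
    // With the fix: p = (int*)sizeof(int); p -= 1; yields address 0.
    assert(adjustAddress(intSize, 1, intSize, true) == 0);
    // Without the fix the Multiplicand is zero, so p keeps its initial value.
    assert(adjustAddress(intSize, 1, 0, true) == intSize);
    // ((int*)0x1234) + 3 with a 4-byte int is 0x1240.
    assert(adjustAddress(0x1234, 3, 4, false) == 0x1240);
}
```

The second check mirrors the symptom I saw: with a zero multiplicand the
pointer value never changes, so the null dereference goes unreported.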
On 28/07/17 16:08, Artem Dergachev wrote:
> First of all, there's already an experimental alpha.core.FixedAddr
> checker that seems to be doing what you want, in a relatively easy
> manner. I cannot guarantee that it actually works well, but it did seem
> to work somehow last time I looked at it.
>
> Regarding execution path coverage:
>
> - In general, yeah, the engine does not guarantee that it won't give up
> when it sees a lot of loops or other execution path splits. Analysis
> across translation units is being worked on in
> https://reviews.llvm.org/D30691 but at the cost of even more giving
> up, so it would reveal cross-module issues at the cost of randomly
> hiding intra-module issues that would otherwise be seen (so if you're
> all about coverage, you'd probably want to use both analyses in two
> runs).
>
> - If you expect many loops with relatively small fixed numbers of
> iterations, have a look at the ongoing GSoC project which, in
> particular, introduces an option to unroll such loops completely even
> if otherwise the analyzer would have given up
> (https://reviews.llvm.org/D34260 is already available in master; see
> other patches by Peter as well); other tweaks to give up in a less
> fatal manner when the loop is infinite ("loop widening") are also
> planned.
>
> - There are many existing options to control the "giving up" behavior
> under -analyzer-config. You may have to read AnalyzerOptions.cpp to
> learn them, since unfortunately they aren't very well documented. They
> control loops and inter-procedural analysis.
>
> > I noticed the engine does not take the value of a file scoped
> > constant pointer "T* const" into account.
>
> Reproduced. Unimplemented, I guess, but shouldn't be hard.
>
> > It seems to me that symbolic values of Locs are not fully tracked.
> > Is this true and is there a way to fully track them?
>
> I don't think I fully understand. Could you give an example?
>
>
> On 7/27/17 6:51 PM, Rafael Stahl via cfe-dev wrote:
>> Hello
>>
>> We are looking into using the clang front-end for static analysis.
>>
>> The goal is to find memory accesses on the source code level whose
>> addresses can be statically determined or constrained. This should
>> work across functions and even translation units.
>>
>> Example:
>> main.c:
>> int main() {
>>     for (int i = 0; i < 4; i++)
>>         access(((int*)0x1234) + i); // pass 0x1234, 0x1238, 0x123c, 0x1240
>>     access(*(int**)0x4444); // pass statically unknown value
>> }
>> other.c:
>> void access(int* p) {
>>     // Want output: read at addr (0x1634|0x1638|0x163c|0x1640|unknown)
>>     // from clang::Expr*.
>>     ((volatile int*)p)[0x100];
>> }
>>
>> The clang StaticAnalysis library does a lot of the work we are
>> interested in. That is, determining what values an expression is
>> constrained to, while understanding stores, loads and running a
>> symbolic execution engine.
>>
>> How scalable is this approach? Even though we would require inter-TU
>> analysis, the problem could be reduced by only looking at accesses
>> that have the volatile qualifier since we are looking at hardware
>> accesses of a bare-metal program. Some retries without inlining are
>> fine, because we assume the accesses are not separated from the
>> constant by code of significant complexity.
>>
>> Will this be decently reliable? We are interested in cases where a
>> constant is dragged across a couple of low bounded loops with a bit
>> of arithmetic. What are typical cases where the engine gives up
>> because of exploding complexity? I have found that loops are explored
>> in a very limited scope. Is there an easy way to relax these limits a
>> bit at the cost of much higher execution time?
>>
>> I noticed the engine does not take the value of a file scoped
>> constant pointer "T* const" into account. Is there a technical
>> limitation that prevents doing this?
>>
>> I also tried to hack a bit on the DereferenceChecker and
>> DivZeroChecker to try and get the symbolic or even concrete value of
>> a Loc, but only got the initialized value and not the value it should
>> be at the dereference. When plotting a graph from a source that does
>> basic arithmetic on a pointer, the expression value never changes. It
>> seems to me that symbolic values of Locs are not fully tracked. Is
>> this true and is there a way to fully track them?
>>
>> A backwards data-flow analysis on IR level is probably a more
>> reasonable approach in general, but getting the exact clang::Expr
>> that does the access is valuable to us.
>>
>> Overall, is this problem reasonably solvable with clang static
>> analysis? Any feedback is greatly appreciated!
>>
>> Best Regards
>> Rafael