[cfe-dev] checkBind: distinguish between MemRegionVal/ElementRegion

Aitor San Juan aitor.sj at opendeusto.es
Sun May 18 09:11:18 PDT 2014


Hello,

Having reread the docs, I have several questions.

1) During the presentation you say that the order in which checker
callbacks happen is not guaranteed by the analyzer as it explores the CFG.
As far as I know, for example, PreCall will always be called before a
PostCall event, and PreStmt always before a PostStmt. I don't really
understand what you were referring to.

2) Should a checker be interested only in the parameter being passed during
a function call, I guess it wouldn't make any difference whether checking
the parameter in a PreCall event or in a PostCall event, would it? However,
in this case, is it better to only register for the PreCall callback/event
because of performance reasons?

3) When tracking the use of values (variables) between callbacks, if
needed, checkers must use the ProgramState as a means of preserving custom
information. This is clear. It's best to refer to those values by the
underlying symbol (symbolic representation) created by the analyzer. In my
case, I want to track the use of pointers to char (variables of type char
*). In this case, the 1st argument to the checkBind callback will be a
MemRegionVal. Reading the documentation, the counterpart of a symbol
(SymbolRef in terms of the API) with regard to MemRegions is a
SymbolicRegion.

Let's consider the following:
char *s = "string literal";
char pwd[] = "password";
char *p;
p = s;    (1)
p = pwd;  (2)

To be able to track the use of "p" --as in (1) and (2) above-- I was
thinking of obtaining a symbol that represents that variable (memory
region) and save that symbol in the ProgramState in case there's a future
reference to it in the program being analyzed, similar to the idea in the
sample SimpleStreamChecker. Why is "p" not a symbolic region in this case?
Reading the API, I thought the best way would be to call getSymbolicBase()
and then getSymbol() on the result, but the former returns NULL. So, did I
misunderstand, and is MemRegion just the counterpart of a SymbolRef?
If so, what would be the best way of saving a MemRegion's symbolic
representation in the ProgramState?
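
To make the question concrete, here is a minimal sketch of the situation as a
self-contained toy model. It is not the real analyzer API: Region,
baseRegion(), symbolicBase(), TrackedRegions and pointsIntoStringLiteral() are
hypothetical stand-ins that only loosely mirror MemRegion::getBaseRegion(),
MemRegion::getSymbolicBase() and StringRegion. It assumes, as the quoted reply
below states, that a local variable does not need a symbol, so its region has
no symbolic base and tracked state would be keyed by the region itself.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-ins for analyzer concepts; NOT the real Clang classes.
struct Region {
    enum Kind { Var, String, Symbolic, Element };
    Kind kind;
    std::string name;     // label for readability only
    const Region *super;  // enclosing region, or nullptr

    // Strip element sub-regions to reach the outermost enclosing region.
    const Region *baseRegion() const {
        const Region *r = this;
        while (r->kind == Element && r->super)
            r = r->super;
        return r;
    }

    // Nearest enclosing symbolic region, or nullptr when there is none.
    const Region *symbolicBase() const {
        for (const Region *r = this; r; r = r->super)
            if (r->kind == Symbolic)
                return r;
        return nullptr;
    }
};

// Toy "program state": regions the checker tracks, keyed by region pointer
// because a variable region such as "p" has no symbol to key on.
using TrackedRegions = std::set<const Region *>;

// True when the pointed-to region lies inside a string literal.
bool pointsIntoStringLiteral(const Region *target) {
    return target && target->baseRegion()->kind == Region::String;
}
```

In this model the region for "p" is a plain variable region, so
symbolicBase() yields a null pointer for it, while the value bound in (1) is
an element region whose base is a string region and the value bound in (2) is
an element region whose base is the variable "pwd".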

I'm confused. Any hint or suggestion would be highly appreciated.
Many thanks.

2014-05-08 19:57 GMT+02:00 Aitor San Juan <aitor.sj at opendeusto.es>:

> Watching that presentation you mention was one of the 1st things I did
> some time ago. I think I'll watch it again to refresh.
>
> I'll reread the docs with your comments in mind.
>
> Thanks for the clarifications, Jordan.
>
>
> 2014-05-08 6:02 GMT+02:00 Jordan Rose <jordan_rose at apple.com>:
>
>> Hello, Aitor. I'm afraid you're still getting SVals, symbols, and
>> MemRegions somewhat mixed up. They are not interchangeable. Have you
>> watched our presentation on writing a checker yet? (Linked here:
>> http://clang-analyzer.llvm.org/checker_dev_manual.html) I'm sorry it's
>> not really incorporated into the rest of the Checker Development Manual,
>> but the video is probably still the clearest introduction to analyzer core
>> concepts that we have.
>>
>>
>>> 1) To test if Loc is a MemRegionVal I use the following, but there's
>>> something wrong I can't figure out (it doesn't compile), and I'm stuck (as
>>> far as I know, MemRegionVal is a subclass of SVal):
>>>
>>> if (clang::isa<loc::MemRegionVal>(Loc)) ...
>>>
>>
>> This is a bit mundane—you can only use isa<> on pointers and references,
>> but SVals are passed around by value. As you discovered, you can use getAs.
>>
>> SymbolRef sym = L->getAsLocSymbol();
>> SymbolRef sym = VLoc.getAsLocSymbol();
>> SymbolRef sym = VLoc.getAsSymbol();
>>
>>
>> The second one will handle everything the first one handles, as well as
>> locations cast to integer values (like "(intptr_t)&x"). The last one will
>> also give you back symbols for non-location values. But not all memory
>> regions are based on symbols (a local variable does not need a symbol), and
>> of course not all symbolic values are memory regions (the result of
>> random() is an integer).
>>
>>
>>> 2) ElementRegion doesn't belong to the SVal class hierarchy. How can I
>>> know if Loc is an ElementRegion?
>>>
>>
>> That's not really a good question. What you really want to know is if a
>> given location is within a constant string region. That's a much simpler
>> question.
>>
>> // Does this value represent the address of a region?
>> const MemRegion *MR = V.getAsRegion();
>> if (!MR)
>>   return;
>>
>> bool isString = isa<StringRegion>(MR->getBaseRegion());
>>
>> This isn't going to cover *all* use cases, but it does cover this one
>> much more nicely than trying to pattern-match on ElementRegion.
>>
>> (Finally, of course, -fconst-strings is a much safer way to handle this
>> kind of issue, but that doesn't help if you have an existing codebase.)
>>
>> Jordan
>>
>>
>