[cfe-dev] Miscellaneous Clang Static Analyzer Questions

Wed Jan 9 10:03:45 PST 2019

On 1/7/19 2:48 AM, Alexey Sidorin via cfe-dev wrote:
> Hello Gianluca,
> I'll try to answer some of your questions.
>
> 05.01.2019 5:52, Gianluca Gross via cfe-dev wrote:
>> Hello all,
>>
>> I am a student at the University of Pennsylvania, and my senior 
>> design team is looking to use the Clang Static Analyzer as the basis 
>> for our code quality app project. We have a few questions about the 
>> analyzer if you don't mind. Apologies if any of these questions are 
>> unclear (I could certainly try to clarify in a future email), but we 
>> would really appreciate it if anyone could provide answers or point 
>> us to resources that might be helpful.
>>
>> First off, we are curious how scan-build interacts with build 
>> processes other than Makefiles. For example, we are considering 
>> integrating our app with the Travis CI build tool, and we were 
>> wondering if there might be any resources or previous examples that 
>> might help us better understand how scan-build interacts with other 
>> build commands. I understand that this is a vague request, but we 
>> would definitely appreciate any help!
>
> Python-based scan-build (located in the scan-build-py dir) uses libear to 
> intercept the processes being started. Internally, it uses the LD_PRELOAD 
> mechanism to load the interception library, handle exec* calls and 
> filter out compiler processes. Then, it dumps the command lines into a 
> JSON compilation database.
>

Yup, and generally because the default scan-build can also operate by 
replacing CC= and CXX= environment variables, it automatically works 
with environments that respect those.

I've never tried doing it with Travis CI specifically, but it might work 
that way too. Or in the worst case, you should be able to manually 
override CC= and CXX= in your yaml config with ccc-analyzer/c++-analyzer 
binaries, which is kinda equivalent to what scan-build is doing.
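
To make the CC=/CXX= override concrete, here is a rough Python sketch of 
a wrapper that a CI step could call. The libexec directory and the helper 
names (analyzer_env, run_build) are made up for illustration, and 
scan-build also sets a handful of other environment variables for the 
wrappers that this sketch leaves out, so treat it as a starting point 
rather than a drop-in replacement.

```python
import os
import subprocess


def analyzer_env(libexec_dir, base_env=None):
    """Return a copy of the environment with CC/CXX pointing at the wrappers.

    libexec_dir is an assumption: wherever your LLVM install keeps
    ccc-analyzer and c++-analyzer.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = os.path.join(libexec_dir, "ccc-analyzer")
    env["CXX"] = os.path.join(libexec_dir, "c++-analyzer")
    return env


def run_build(build_cmd, libexec_dir):
    """Run an arbitrary build command (a list of arguments) with the
    overridden compilers and return its exit code."""
    result = subprocess.run(build_cmd, env=analyzer_env(libexec_dir), check=False)
    return result.returncode
```

Something like run_build(["make", "-j4"], "/path/to/llvm/libexec") would 
then behave roughly like a build whose CC= and CXX= were overridden by hand, 
provided the build system respects those variables.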

>
>>
>> We are also curious about the differences between the "scan-build" 
>> and "clang-check" commands for running the analyzer. It seems that 
>> scan-build is the preferred way to run the analyzer, but is 
>> clang-check also worth considering? The clang-check command seems 
>> useful because it allows you to simply run the analyzer on source 
>> code rather than requiring a build command. Is there any way to make 
>> the clang-check command output the same HTML output files that are 
>> produced by scan-build?
>
> clang-check doesn't look like a good choice to me. Generally, a build 
> command is almost always required because it affects how the AST 
> is built. I've never heard any success stories about clang-check usage 
> with CSA.
>

Wanted to mention that clang-check is capable of using compilation 
databases, like scan-build-py, but it cannot gather them by intercepting 
arbitrary build systems. So if you have the JSON compilation database 
already (e.g., provided by your build system), you can technically use 
clang-check to run both the analyzer and clang-tidy. It might even be 
possible to make it produce html or plist output by adding something like

   -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-arg=-Xclang 
-extra-arg=analyzer-output=html -extra-arg=-o -extra-arg=/your/output/dir

(I know, right?) to your `clang-check -analyze --` command. But I've 
never tried that. And it definitely won't build an index.html page for you.
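
If you already have a compile_commands.json, a small script can turn it 
into one clang-check invocation per source file. The following is an 
untested sketch: the function name clang_check_commands is made up, it 
only reads the "file" key of each entry, and it uses -p to point 
clang-check at the directory that holds the database instead of the 
trailing `--` form above.

```python
import json


def clang_check_commands(db_text, build_dir, extra_args=()):
    """Build one `clang-check -analyze` command per unique source file.

    db_text is the content of a JSON compilation database; build_dir is
    the directory that contains it (passed to clang-check via -p).
    """
    entries = json.loads(db_text)
    seen = set()
    commands = []
    for entry in entries:
        source = entry["file"]
        if source in seen:  # a file may be listed more than once
            continue
        seen.add(source)
        cmd = ["clang-check", "-analyze", "-p", build_dir]
        # Each extra compiler argument is forwarded via its own -extra-arg=.
        cmd += ["-extra-arg=" + arg for arg in extra_args]
        cmd.append(source)
        commands.append(cmd)
    return commands
```

Each returned list can be handed to subprocess.run as-is. Whether the 
output flags above actually do what you want is something you'd still have 
to check by experiment.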

>
>>
>> Also, for our project, we are looking to parse the "logical traces" 
>> that the analyzer outputs to report bug alarms (e.g. assuming 
>> variable is NULL, taking true branch, etc.). Would it be most 
>> reliable to parse the plist/XML files which are output by scan-build? 
>> Since our project will rely heavily on this information, we would 
>> like to make sure that our app is not too sensitive to future changes 
>> in the static analyzer. Is there some kind of fundamental underlying 
>> schema in which this logical information is stored, or will the XML 
>> files always remain consistent in the future?
>
> Yes, plists are the best choice for automatic parsing.

And they're also more stable, indeed. We're adding new features from 
time to time, but we haven't changed the format in a 
non-backwards-compatible way in years. Backwards compatibility here 
means that we may add new keys to plist dictionaries (and convey extra 
information that way), but it won't prevent you from looking up the old 
information by accessing the old keys, so you're safe as long as you're 
using a full-featured and reliable plist parser library.
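
As an illustration of that "look up the old keys, ignore the rest" 
approach, here is a sketch using Python's plistlib. The key names 
("files", "diagnostics", "path", "kind", "message", "location") are 
written from memory of what the analyzer emits, so please check them 
against a real report before relying on them; read_traces is just a name 
I picked.

```python
import plistlib


def read_traces(plist_bytes):
    """Extract a simplified bug trace from analyzer plist output.

    Only known keys are looked up (with defaults), so keys added by newer
    analyzer versions are ignored instead of breaking the parser.
    """
    report = plistlib.loads(plist_bytes)
    files = report.get("files", [])
    results = []
    for diag in report.get("diagnostics", []):
        steps = []
        for piece in diag.get("path", []):
            # Keep only the "event" pieces; other kinds are skipped here.
            if piece.get("kind") != "event":
                continue
            loc = piece.get("location", {})
            steps.append((loc.get("line"), piece.get("message", "")))
        loc = diag.get("location", {})
        index = loc.get("file")
        in_range = isinstance(index, int) and 0 <= index < len(files)
        results.append({
            "description": diag.get("description", ""),
            "file": files[index] if in_range else None,
            "line": loc.get("line"),
            "steps": steps,
        })
    return results
```

Feeding it the bytes of one report file should give you a list of 
dictionaries, one per diagnostic, each with the ordered list of 
(line, message) steps.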

>
>
>>
>> We also discovered that scan-build has an experimental Cross 
>> Translation Unit analysis, which seems like it could be great for our 
>> project. Is there any way we could follow future plans/updates and 
>> learn more information about the CTU?
>
> Some basic information about how CTU works can be found in this talk: 
> https://youtu.be/jbLkZ82mYE4?t=972. Basically, we use the AST merging 
> mechanism to import the function definitions unavailable in the 
> translation unit being analyzed. We (especially Ericsson folks) are working 
> on improving CTU quality and stability. You can track the 
> patches on Phabricator (reviews.llvm.org) by "ASTImporter" and "CTU" 
> tags.
>
>
>>
>> Apologies for the long list of questions. Any and all help would be 
>> appreciated, and thank you in advance!
>
> You're welcome.
>
>
>>
>> Best,
>> Gianluca
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev