Thanks Ted for your insight; this should be fairly straightforward for me now.

> On Dec 31, 2014, at 12:01 AM, Ted Kremenek <kremenek@apple.com> wrote:
>
> There's two design points at play here that I think are worth noting:
>
> (1) This interposition is not a great hack. Sometimes the build system doesn't control every compiler invocation. For example, a Makefile build could spin off a shell script that invokes the compiler, and that script may not consult CC or CXX. Also, some Makefiles (or whatever) hardwire the compiler they use, so setting CC or CXX has no effect. Unless the build system has total control of the compilation jobs, the only sure way to interpose on all compiler instances is to do true OS interposition. This can be accomplished in various platform-specific ways on different platforms, and is the approach taken by several commercial static analysis tools.

This might be irrelevant, but I have spent a *lot* of time wrestling with makefile-based systems over the years (~25 years), both my own and other people's. I would personally consider a Makefile that hardwires CC or CXX to be a broken Makefile, and I would not let it influence my thinking on how I'd implement the analyzer. The business of running another script from a Makefile is one of the reasons I'm not using make anymore; while sometimes necessary, it is a path that leads to a rat's nest, so I would also disregard that case when thinking about an analyzer.
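To be concrete about the case I mean, here is a rough sketch; the wrapper names are the ones scan-build actually uses, but the Makefile rules are just illustrative:

    # scan-build interposes by pointing the compiler variables at its
    # wrapper scripts, which run the real compile plus the analyzer:
    scan-build make        # roughly: make CC=ccc-analyzer CXX=c++-analyzer

    # A rule that honors $(CC) gets analyzed:
    #   foo.o: foo.c
    #           $(CC) -c foo.c -o foo.o
    #
    # A rule (or a spawned helper script) that hardwires the compiler
    # silently escapes the analysis:
    #   bar.o: bar.c
    #           gcc -c bar.c -o bar.o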
> (2) The way we do analysis and gather reports could hypothetically change in the future. Right now we analyze each file at a time, generate reports, and then stitch the reports together. In the future, a reasonable direction for the analyzer would be to do two phases of analysis: first do a pass over the files like we do now, and then do a more global analysis to find bugs that cross file boundaries. This approach would inherently defer the generation of reports to the very end. Also, the mechanism by which reports are generated could change. Right now HTML files are generated eagerly, but those really are a static rendering. A more versatile approach would be to generate output that summarizes the bug reports, and have the HTML reports (or whatever) generated from those.

A two-phase system sounds like a huge win if it gets us global analysis. As far as the interface between the build system and the analyzer goes, I think it will come down to how the state from each phase is passed along. You could require the build system to pass some identifier to the analyzer on every invocation; the identifier could then be turned into a directory name or a database key, or whatever else the analyzer needs to communicate between passes. That feels like a natural thing for a build system to do (generate a build id). I'm sure when I go look in the scan-build script there's going to be some "look in /tmp for a set of files named X" type arrangement; this would just be formalizing that idea (see the sketch after my signature).

Thanks again for your help,

- James
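P.S. Purely hypothetical, but to make the build-id idea concrete, the interface I'm picturing is something like the following; none of these analyzer flags exist today, and the names are made up:

    BUILD_ID=$(uuidgen)        # the build system mints one id per build

    # Pass 1: per-file analysis, with results stashed under the build id
    analyzer --phase=collect --build-id="$BUILD_ID" -c foo.c
    analyzer --phase=collect --build-id="$BUILD_ID" -c bar.c

    # Pass 2: cross-file analysis and report generation over everything
    # collected for that id (a directory, a database key, whatever)
    analyzer --phase=report --build-id="$BUILD_ID" -o reports/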