<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Mar 23, 2017, at 11:48 AM, Reid Kleckner via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote">On Thu, Mar 23, 2017 at 10:55 AM, David Blaikie <span dir="ltr" class=""><<a href="mailto:dblaikie@gmail.com" target="_blank" class="">dblaikie@gmail.com</a>></span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><span class=""><div dir="ltr" class="">On Thu, Mar 23, 2017 at 10:45 AM Reid Kleckner <<a href="mailto:rnk@google.com" target="_blank" class="">rnk@google.com</a>> wrote:</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="m_4816353318239515335gmail_msg"><div class="gmail_extra m_4816353318239515335gmail_msg"><div class="gmail_quote m_4816353318239515335gmail_msg"><div class="m_4816353318239515335gmail_msg">I don't think it will be feasible to generalize UBSan's knowledge to the static analyzer.</div></div></div></div></blockquote></span><div class=""><br class="">Why not? 
The rough idea I meant would be to express the constraints UBSan is checking into the static analyzer - I realize the current layering (UBSan being in Clang's IRGen) doesn't make that trivial/obvious, but it seems to me that the constraints could be shared in some way - with some work.<br class=""></div></div></div></blockquote><div class=""><br class=""></div><div class="">Maybe I am not imaginative enough, but I cannot envision a clean way to express the conditions that trigger UB that is useful for both static analysis and dynamic instrumentation. The best I can come up with is AST-level instrumentation: synthesizing AST nodes that can be translated to IR or used for analysis. That doesn't seem reasonable, so I think getting ubsan into the static analyzer would end up duplicating the knowledge of what conditions trigger UB.</div><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><div class=""></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="m_4816353318239515335gmail_msg"><div class="gmail_extra m_4816353318239515335gmail_msg"><div class="gmail_quote m_4816353318239515335gmail_msg"><div class="m_4816353318239515335gmail_msg"> The static analyzer CFG is also at best an approximation of the real CFG, especially for C++.</div></div></div></div></blockquote></span><div class=""><br class=""></div></div></div></blockquote></div></div></div></div></blockquote><div><br class=""></div><div>First, the CFG is used not only by the analyzer but also by Sema (e.g., for the unreachable-code and uninitialized-variable warnings).</div><div>Second, while there are corners of C++ that are not supported, the CFG otherwise has high fidelity.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div 
class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><div class="">Sure enough - and I believe some of the people working/caring about it would like to fix that. I think Manuel & Chandler have expressed the notion that the best way to do that would be to move to a world where the CFG is used for CodeGen, so it's a single/consistent source of truth.<br class=""></div></div></div></blockquote><div class=""><br class=""></div><div class="">Yes, we could do all that work, but LLVM's CFG today is already precise for C++. If we allow ourselves to emit diagnostics from the middle-end, we can save all that work.</div><div class=""><br class=""></div><div class="">Going down the high-effort path of extending the CFG and abstracting or duplicating UBSan’s checks as static analyses on that CFG would definitely provide a better diagnostic experience, but it's worth re-examining conventional wisdom and exploring alternatives first.</div></div></div></div></div></blockquote><div><br class=""></div>The idea of building analyses on top of LLVM IR is not new and has been discussed before. My personal belief is that having access to the AST (or just the code as it was written by the user) is very important. It ensures we can provide precise diagnostics. It also allows us to see when users want to suppress an issue report by changing the way the source code is written. For example, it allows us to tell the developer that they can suppress a report with a cast. We can also differentiate between “NULL” and “0”, which allows us to determine whether the programmer intended to use a pointer constant or a zero numeric value.</div><div><br class=""></div><div>Ensuring that the clang CFG completely supports C++ is a somewhat challenging but not insurmountable task. 
In addition, fixing up the unsupported corners in the CFG would benefit not only the clang static analyzer but also all the other “users”, such as clang warnings, clang-tidy, and possibly even refactoring in the future.</div><div><br class=""></div><div>Here is a thread where this has been discussed before:</div><div><a href="http://clang-developers.42468.n3.nabble.com/LLVM-Dev-meeting-Slides-amp-Minutes-from-the-Static-Analyzer-BoF-td4048418.html" class="">http://clang-developers.42468.n3.nabble.com/LLVM-Dev-meeting-Slides-amp-Minutes-from-the-Static-Analyzer-BoF-td4048418.html</a></div><div><br class=""><blockquote type="cite" class=""><div class="">

</div></blockquote></div><br class=""></body></html>