<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 1 June 2015 at 07:20, Marshall Clow <span dir="ltr"><<a href="mailto:mclow.lists@gmail.com" target="_blank">mclow.lists@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">This weekend, I got an email from Nuno Lopes informing me that UBSAN now paid attention to attribute(nonnull), and he was having some problems with it going off when using libc++.</div></blockquote><div><br></div><div>FYI, I also looked into turning this on, but with libstdc++, and found that they annotated basic_string<T>::assign(pointer, len) with attribute nonnull. That's a problem, because it's valid to call basic_string<T>::assign(nullptr, 0), but the reasoning why it's valid makes me want to ask the committee whether this is what they intended.</div><div><br></div><div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">The standard's text says that the pointer must point to an array of 'n' elements (where 'n' is the second argument)</span>, but earlier it also states that, in the library, whenever an argument is described as an "array", the pointer may have any value for which all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid. 
Thus, nullptr is valid if 'n' is zero.</div><div><br></div><div>This was changed in DR2235:<br><a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__cplusplus.github.io_LWG_lwg-2Ddefects.html-232235&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=CnzuN65ENJ1H9py9XLiRvC_UQz6u3oG6GUNn7_wosSM&m=QwH1ytfUTZjev-FxDSL0yAcJONZ2lxGVMLRccx3chN4&s=zilKOD4sAJgECSmOJ5w3PGrblDbcIReiziRCpVOOkw8&e=">http://cplusplus.github.io/LWG/lwg-defects.html#2235</a></div><div>The text and discussion of DR2235 sound like they intend to make the behaviour of assign match that of the constructor that takes the same arguments. What they actually did was change the constructor to match the behaviour of assign, and it doesn't look like removing the requirement of a nonnull pointer was considered and intended.<br></div><div><br></div><div>At this point I made a note that somebody should ask the committee when they get the chance, and never got back around to it.</div><div><br></div><div>Nick</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div>I did some investigation, and found that he was exactly right - there were places (deep inside the vector code, for example) which called std::memcpy(null, null, 0) - which is definitely UB.</div><div><br></div><div>In an ideal world, our C library would define ::memcpy with the correct annotations, and libc++ would import that into namespace std, and we'd be golden. 
</div><div><br></div><div>But we don't have a C library - we use whatever is provided by the system we're running on, so that's not really an option.</div><div><br></div><div>For my testing, I changed libc++'s <cstring> header:</div><div><br></div><div><div>-using ::memcpy;</div><div>+inline _LIBCPP_INLINE_VISIBILITY </div><div>+void* memcpy(void* __s1, const void* __s2, size_t __n) __attribute__((nonnull(1, 2)))</div><div>+{ return ::memcpy(__s1, __s2, __n); }</div></div><div><br></div><div>(similarly for memmove and memcmp), and I found several cases of simple code that UBSAN now fires on:</div><div><br></div><div>such as: std::vector<int> v; v.push_back(1);</div><div>and: int *p = NULL; std::copy(p,p,p);</div><div><br></div><div>This seems fairly useful to me.</div><div><br></div><div>I would like to hear other people's opinions about:</div><div><br></div><div>* Is adding this kind of UB detection something that people want in libc++?</div><div><br></div><div>* What do people think about wrapping the C library functions to enable UBSAN to catch them? (This is a separate question from the first, because I can see putting these kinds of parameter checks into functions that have no counterpart in the C library.) Sadly, this would NOT affect calls to ::memcpy (for example), just std::memcpy.</div><div><br></div><div>* Is that the best way to annotate the declarations? Is there a more portable, standard way to do this? (Things that start with double underscores worry me.) In any case, I would probably wrap this in a macro to support compilers that don't understand whatever mechanism we end up using.</div><div><br></div><div>Thanks</div><span class=""><font color="#888888"><div><br></div><div>-- Marshall</div><div><br></div></font></span></div>
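<div><br></div><div>A minimal sketch of the macro-guarded wrapper idea, for illustration only. The names SKETCH_NONNULL, sketch::checked_memcpy and sketch::copy_bytes are hypothetical and are not part of libc++; the attribute is written before the declaration here on the assumption that both GCC and Clang accept that placement on a definition. The second function shows one way a caller could avoid reaching memcpy at all when the length is zero.</div><div><br></div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical macro (not a libc++ name): expands to the GNU-style nonnull
// attribute where it is understood, and to nothing elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_NONNULL(...) __attribute__((__nonnull__(__VA_ARGS__)))
#else
#  define SKETCH_NONNULL(...)
#endif

namespace sketch {

// Wrapper whose pointer parameters are annotated as non-null, so that a
// sanitizer which checks the attribute can flag null arguments at call sites.
SKETCH_NONNULL(1, 2)
inline void* checked_memcpy(void* dst, const void* src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

// Caller-side guard: with nothing to copy, skip the call entirely, so a
// null/null/0 request never reaches the annotated function.
inline void* copy_bytes(void* dst, const void* src, std::size_t n)
{
    if (n == 0)
        return dst;
    return checked_memcpy(dst, src, n);
}

} // namespace sketch
```

<div><br></div><div>With this shape, sketch::copy_bytes(nullptr, nullptr, 0) returns without touching memcpy, while a null pointer with a non-zero length still reaches the annotated wrapper, where a check on the attribute can report it.</div>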
<br></blockquote></div><br></div></div>
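<div><br></div><div>On the basic_string::assign(nullptr, 0) point: until the committee's intent is clear, calling code can sidestep the question by never passing the pointer when the length is zero. The following is only a sketch under that assumption; sketch::assign_range is a hypothetical helper, not something libstdc++ or libc++ provides.</div><div><br></div>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Hypothetical helper: assigns n characters starting at p to s.
// A zero length is handled without forwarding p, so a null p with n == 0
// never reaches an assign() overload that may be annotated as nonnull.
inline void assign_range(std::string& s, const char* p, std::size_t n)
{
    if (n == 0) {
        s.clear();
        return;
    }
    s.assign(p, n);
}

} // namespace sketch
```

<div><br></div><div>For a zero length the result is the same empty string that assign(p, 0) would produce, so the guard only changes whether the pointer is ever looked at.</div>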