[cfe-dev] RFC: Adding attribute(nonnull) to things in libc++

Nick Lewycky nlewycky at google.com
Wed Jun 3 00:21:42 PDT 2015

On 1 June 2015 at 07:20, Marshall Clow <mclow.lists at gmail.com> wrote:

> This weekend, I got an email from Nuno Lopes informing me that UBSAN now
> paid attention to attribute(nonnull), and he was having some problems with
> it going off when using libc++.

FYI, I also looked into turning this on, but with libstdc++, and found that
they annotated basic_string<T>::assign(pointer, len) with attribute
nonnull. That's a problem, because it's valid to call
basic_string<T>::assign(nullptr, 0), but the reasoning why it's valid makes
me want to ask the committee whether this is what they intended.

The language std text claims that the pointer must point to an array of 'n'
(second argument) length, but earlier in the text it also states that in
the library, whenever it says "array" it means any pointer upon which
address computations and accesses to objects (that would be valid if the
pointer did point to the first element of such an array). Thus, nullptr is
valid if 'n' is zero.

This was changed in DR2235:
The text and discussion of DR2235 sound like they intend to make the
behaviour of assign match that of the constructor that takes the same
arguments. What they actually did was change the constructor to match the
behaviour of assign, and it doesn't look like removing the requirement of a
nonnull pointer was considered and intended.

At this point I made a note that somebody should ask the committee when
they get the chance, and never got back around to it.


I did some investigation, and found that he was exactly right - there were
> places (deep inside the vector code, for example) which called
> std::memcpy(null, null, 0) - which is definitely UB.
> In an ideal world, our C library would define ::memcpy with the correct
> annotations, and libc++ would import that into namespace std, and we'd be
> golden.
> But we don't have a C library - we use whatever is provided by the system
> we're running on, so that's not really an option.
> For my testing, I changed libc++'s <cstring> header:
> -using ::memcpy;
> +void* memcpy(void* __s1, const void* __s2, size_t __n)
> __attribute__((nonnull(1, 2)))
> +{ return ::memcpy(__s1, __s2, __n); }
> (similarly for memmove and memcmp), and I found several cases of simple
> code that now UBSAN fires off on:
> such as:    std::vector<int> v;  v.push_back(1);
> and :          int *p = NULL;  std::copy(p,p,p);
> This seems fairly useful to me.
> I would like to hear other people's opinions about:
> * Is adding this kind of UB detection something that people want in libc++?
> * What do people think about wrapping the C library functions to enable
> UBSAN to catch them (this is a separate Q from the first Q, because I can
> see putting these kind of parameter checks into functions that have no
> counterpart in the C library). Sadly, this would NOT affect calls to
> ::memcpy (for example), just std::memcpy.
> * Is that the best way to annotate the declarations? Is there a more
> portable, standard way to do this (things that start with double
> underscores worry me). In any case, I would probably wrap this in a macro
> to support compilers that don't understand whatever mechanism we end up
> using.
> Thanks
> -- Marshall
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150603/61c0b3ee/attachment.html>

More information about the cfe-dev mailing list