[cfe-dev] C++ Annex K safe C11 functions

JF Bastien via cfe-dev cfe-dev at lists.llvm.org
Fri Jan 4 13:50:21 PST 2019



> On Jan 4, 2019, at 1:47 PM, Jonny Grant <jg at jguk.org> wrote:
> 
> Hi! Sounds great
> How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba so it's easily identifiable when they crop up in use?
> We used to clear buffers to 0x11111111 and stack to 0x22222222 I recall

This isn’t relevant to the Annex K discussion; let’s keep this thread focused. We discussed initialization values in the original thread as well as in the code review. It’s worth reading through those to see why I chose the values I did: mainly so that pointers are invalid, and because repeated byte values give better code generation.


> The URL should be llvm.org/r349442  BTW
> 
> Jonny
> 
> On 04/01/2019 20:37, JF Bastien wrote:
>> I think clang could offer builtins which provide some of the Annex K building blocks, and let libc implementations provide the rest (using the clang builtins when available).
>> 
>> I’m interested in implementing these builtins, unless someone beats me to it. User code often uses Annex K to provide guarantees that are now redundant with trivial automatic variable initialization (llvm.org/rL349442), and I’d like to reduce the hit they’re taking. Here are some notes I wrote for myself a little while ago:
>> 
>> I’ll focus on memset, but this applies to other Annex K functionality (memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s, strncat_s, strtok_s, memset_s, strerror_s, strerrorlen_s, strnlen_s).
>> 
>> These functions simply perform extra checks before calling their regular equivalent, with an extra provision that the operation can’t be as-if’d away (i.e. you have to do the entire memset).
>> 
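A minimal sketch of the "extra checks, then the regular operation" shape described above, following the runtime constraints C11 Annex K lists for memset_s. The names sketch_memset_s and SKETCH_RSIZE_MAX are hypothetical, the constraint handler call is left out, and volatile stores are only one common way to keep the writes from being optimized away.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RSIZE_MAX; the real value is implementation-defined. */
#define SKETCH_RSIZE_MAX (SIZE_MAX >> 1)

/* Hypothetical name, not a real library function. Returns 0 on success and
   nonzero on a runtime-constraint violation. A conforming memset_s would
   also invoke the installed constraint handler, which is omitted here. */
int sketch_memset_s(void *s, size_t smax, int c, size_t n)
{
    int violation = 0;
    size_t count = n;

    if (s == NULL || smax > SKETCH_RSIZE_MAX)
        return 1;                 /* nothing can safely be written */

    if (n > SKETCH_RSIZE_MAX || n > smax) {
        violation = 1;
        count = smax;             /* on violation, fill the first smax bytes */
    }

    /* Volatile stores: the compiler may not drop them under the as-if rule. */
    volatile unsigned char *p = (volatile unsigned char *)s;
    for (size_t i = 0; i < count; ++i)
        p[i] = (unsigned char)c;

    return violation;
}
```

A call such as sketch_memset_s(buf, sizeof buf, 0, sizeof buf) clears the whole buffer and returns 0; asking for more bytes than smax still fills the first smax bytes but reports the violation.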
>> Often, custom memset_s implementations are simple loops (cast to char*, and set each byte to 0), compiled in a different TU, which, amusingly, through LTO and inlining, would totally not obey the “no as-if” rule. Other times they’re implemented in opaque assembly.
>> 
>> Clang doesn’t know about this function, and assumes it’s just another function call. We should tell the compiler what these functions do so that it knows that stores prior to memset_s are dead, that memset_s can’t be removed, and what the extra memset_s checks are, and so that we can forward values from memset_s. We can then generate better code (small loops become stores without a loop, memset_s followed by stores get merged, etc).
>> 
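To illustrate the kind of reasoning meant here, this sketch uses plain memset, which the compiler already understands. The proposal would give it the same knowledge about memset_s, minus permission to remove the call itself. The function name is hypothetical, and whether a given compiler actually performs these optimizations depends on version and flags.

```c
#include <string.h>

/* Hypothetical example function, not taken from any library. */
int read_after_clear(void)
{
    char buf[16];

    buf[0] = 42;                  /* dead store: overwritten before any read */
    memset(buf, 0, sizeof buf);   /* the compiler knows exactly what this writes */
    return buf[3];                /* value can be forwarded from the memset: 0 */
}
```

If the memset were instead a call to an opaque, unknown function, the compiler would have to keep the store to buf[0] and reload buf[3] from memory after the call.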
>> A few options:
>> 
>> 1. Make them a builtin, have libc implementations forward to the builtin.
>> 2. Teach clang / LLVM about these functions’ semantics (i.e. if conditions met, same as memset).
>> 3. Add an attribute which teaches clang about memset-like functions, and use it in libc implementations.
>> 4. Use LTO between the projects and libc implementations, allowing clang to peek into memset_s’s implementation.
>> 
>> I think 1. is the best approach.
>> 
>> 
>> 
>>> On Jan 4, 2019, at 1:35 AM, Jonny Grant via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>> 
>>> Thank you for your reply Richard.
>>> 
>>> On 03/01/2019 22:04, Richard Smith wrote:
>>>> On Thu, 3 Jan 2019 at 13:44, Jonny Grant via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>>    Hello
>>>>    This file, "stdint.h", lists part of Annex K:
>>>>    https://clang.llvm.org/doxygen/stdint_8h_source.html
>>>>    But the main C++ page doesn't mention Annex K. Is Annex K really fully
>>>>    supported?
>>>> That's generally not up to us; that's part of the C standard library, not part of the compiler.
>>>> The one part of Annex K that *is* part of the compiler, according to the usual division of responsibilities, wherein the compiler provides the freestanding headers and the C standard library provides the rest, is the definition of rsize_t in <stddef.h> and the definition of RSIZE_MAX in <stdint.h>, and Clang provides those if __STDC_WANT_LIB_EXT1__ is defined. However, we do not define __STDC_LIB_EXT1__ because, as noted, that's not up to us, and we have no idea what your C standard library supports.
>>> 
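A small sketch of how code can probe the division of responsibilities described above: it requests the Annex K declarations, then checks what the toolchain and library actually advertise. The two helper names are hypothetical, and the results depend on the compiler and C library in use.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* must precede the includes it affects */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 1 if the C library advertises Annex K, else 0. */
int library_claims_annex_k(void)
{
#ifdef __STDC_LIB_EXT1__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if RSIZE_MAX is visible after the request above. */
int rsize_max_visible(void)
{
#ifdef RSIZE_MAX
    return 1;
#else
    return 0;
#endif
}
```

Code that wants memset_s and friends should test __STDC_LIB_EXT1__ before calling them, since defining __STDC_WANT_LIB_EXT1__ only asks for the declarations and does not guarantee the library has them.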
>>> I use glibc, which doesn't support Annex K. We are keen to use Annex K functionality, so I'm looking around for options.
>>> 
>>> Do you know if Clang has any intention to develop support for a C11 libc with Annex K support?
>>> 
>>> 
>>> I'm looking around, and came across this project:
>>> https://github.com/rurban/safeclib/blob/master/README
>>> 
>>> 
>>>> So in that sense, we implement the part of Annex K that is in our domain.
>>>>    Some background
>>>>    https://clang.llvm.org/compatibility.html
>>>>    https://clang.llvm.org/cxx_status.html
>>>> I'm not sure what these are supposed to show: Annex K is optional in C, and not part of C++.
>>> 
>>> It would be good if what you state could be added to the compatibility page: that Annex K is supported only for stdint.h, and that clang otherwise requires a libc which implements the C11 Annex K functions.
>>> 
>>> Cheers, Jonny
>>> 
>>> 
>> 
> 
