[LLVMdev] Handling of unsafe functions

Wed Sep 19 02:10:48 PDT 2012

Martinez, Javier E wrote:
> Hello,
>
> Using a static code analyzer, we have identified functions in the LLVM
> sources that are marked as a “security vulnerability”[1][2]. There has
> been work already done to address some of them for Linux (e.g.
> snprintf). We are attempting to solve this issue in a comprehensive
> fashion across all platforms. Most of the functions identified are for
> manipulating strings. memcpy is the most commonly used of all these
> insecure functions. The following table lists all these functions and
> their recommended secure alternatives.
>
> Recommended alternatives:
>
> Function    Windows      Unix/Mac OS
> --------    ---------    -----------
> memcpy      memcpy_s     -
> sprintf     sprintf_s    snprintf
> sscanf      scanf_s      -
> _alloca     _malloca     -
> strcat      strcat_s     strlcat
> strcpy      strcpy_s     strlcpy
> strtok      strtok_s     -
>
> The proposal is to add secure versions of these functions. These
> functions will be implemented in the LLVM Support module and be used by
> all other LLVM modules. The interface of these functions will be platform
> independent while their implementation will be platform specific (like
> the Mutex class in the Support module). In cases where the platform does
> not support the functionality natively, we are writing an implementation
> of these functions. For example, in the case of memcpy the secure
> function will look like llvm::memcpy_secure.
>
> Some secure functions require additional data that needs to be passed
> (like buffer sizes). That information has to be added in all places of
> invocation. In some cases, this requires an extra size_t argument to be
> passed through. Hence, this change would not just be a one-to-one
> function refactoring. The attached patch helps illustrate how an
> instance of memcpy would be modified.
>
> Is this proposal of interest to the LLVM community? Can you also comment
> if the approach specified is good to address this issue?
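
For readers without the attached patch, here is a minimal sketch of what a
wrapper with the extra size argument might look like. The signature, the bool
return value, and the "sketch" namespace are all assumptions made for
illustration; the thread only gives the name llvm::memcpy_secure, not its
interface or how it reports failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {
// Hypothetical stand-in for the proposed llvm::memcpy_secure. DstSize is the
// extra size_t argument that every call site would have to supply.
// Returns true if the copy was performed, false if it was refused.
inline bool memcpy_secure(void *Dst, std::size_t DstSize, const void *Src,
                          std::size_t Count) {
  if (Count == 0)
    return true; // Nothing to copy.
  if (Dst == nullptr || Src == nullptr || Count > DstSize)
    return false; // Refuse the copy instead of overflowing Dst.
#if defined(_MSC_VER)
  // Platform-specific path: defer to the CRT's checked copy.
  return memcpy_s(Dst, DstSize, Src, Count) == 0;
#else
  // No native checked variant assumed here, so check first and then copy.
  std::memcpy(Dst, Src, Count);
  return true;
#endif
}
} // namespace sketch
```

A call such as memcpy(Dst, Src, N) would become
sketch::memcpy_secure(Dst, sizeof(Dst), Src, N), which is why the destination
size has to be threaded through to every call site. Returning a status is only
one option; how a failed check should be reported is left open here.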

Personally, I'm not particularly interested in blanket replacement of 
memcpy with memcpy_s in the hopes that it might close a security hole. I 
am very interested in fixing any actual bugs. If it's easier to fix real 
bugs by aggressively using this additional layer, then that may well be 
the way to go, but before I agree to that, I've got a ton of questions 
that need answers first.

What's the current error rate? How often are we seeing bugs in llvm that 
would be fixed if only we were calling "secure" functions?

What's the impact of calling the secure function? On Release builds and 
on Debug builds? On size and performance?

Why not rely on platforms to secure these functions? For instance, Linux 
and Darwin both have FORTIFY_SOURCE, and I'm too ignorant of Windows to 
know what the equivalent is there. What about existing tools like 
valgrind or ASAN?

What happens if memcpy_secure does detect an insecure memcpy? It's 
considered very rude for LLVM to terminate on the spot since it's often 
used as a library, so how do we handle the error? By calling 
llvm::report_fatal_error and hoping we don't recurse? What if it's a 
debug build and we'd like to see where the code went wrong?

How do you plan to enforce that the insecure functions aren't called?
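
One mechanism that exists for this on GCC and Clang is "#pragma GCC poison",
which turns any later use of the named identifiers into a compile error. The
sketch below is only an illustration of that idea, not something proposed in
this thread: the copyString helper and the "sketch" namespace are hypothetical,
and MSVC would need a different mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {
// Hypothetical bounded replacement for strcpy. Copies at most DstSize - 1
// characters, always NUL-terminates, and returns false on truncation.
inline bool copyString(char *Dst, std::size_t DstSize, const char *Src) {
  if (Dst == nullptr || Src == nullptr || DstSize == 0)
    return false;
  std::size_t Len = std::strlen(Src);
  std::size_t N = Len < DstSize ? Len : DstSize - 1;
  std::memcpy(Dst, Src, N);
  Dst[N] = '\0';
  return Len < DstSize;
}
} // namespace sketch

#if defined(__GNUC__)
// Must come after every header that mentions these names. From this point
// on, spelling strcpy or strcat anywhere in the file fails to compile.
#pragma GCC poison strcpy strcat
#endif
```

The pragma works at the preprocessor level, so it has to be placed after all
includes that declare the poisoned functions, which makes it awkward to apply
tree-wide from a single shared header.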

Nick

> References:
>
> [1] http://msdn.microsoft.com/en-us/library/ms235384(v=vs.80).aspx
>
> [2]
> https://developer.apple.com/library/mac/#documentation/Security/Conceptual/SecureCodingGuide/Articles/BufferOverflows.html
>