[LLVMdev] Handling of unsafe functions

Thu Sep 20 00:54:48 PDT 2012

Hi Nick,

Thanks for taking the time to review the proposal. I'd like to stress out that the purpose of the changes in the proposal are not to improve performance or to fix existing bugs. The purpose is to catch instances where a buffer overrun would happen with specific function calls. With the appropriate input the buffer overrun could lead to undefined behavior or a crash.

What's the current error rate? How often are we seeing bugs in llvm that would be fixed if only we were calling "secure" functions?
[JM] Zero, the motivation is not to fix exiting bugs but potential hidden ones.

What's the impact of calling the secure function? On Release builds and on Debug builds? On size and performance?
[JM] I haven't compared the performance of the secure functions and performance is not mentioned much on the pages I've looked at. However I found at [1] a performance comparison between strcpy and strcpy_s. There appears to be a performance penalty for the use of strcpy_s. In a real case it's very likely that the penalty is hidden by bottlenecks in other places. If the performance degradation is apparent then the proposal can be modified. For example, the *_secure version could do only parameter validation and size checks.

Why not rely on platforms to secure these functions? For instance, Linux and Darwin both have FORTIFY_SOURCE, and I'm too ignorant of Windows to know what the equivalent is there. What about existing tools like valgrind or ASAN?
[JM] I'm not too familiar with Linux specifics and the tools you mentioned but from what I could gather FORTIFY_SOURCE doesn't cover all buffer overflow cases. The cases it covers are exactly the ones I described above, where the size is known at compile time. I don't know of a FORTIFY_SOURCE equivalent for Windows. The proposed solution would work for both Windows and Linux and cover more general cases of buffer overflows.

What happens if memcpy_secure does detect an insecure memcpy? It's considered very rude for LLVM to terminate on the spot since it's often used as a library, so how do we handle the error? By calling llvm::report_fatal_error and hoping we don't recurse? What if it's a debug build and we'd like to see where the code went wrong?
[JM] We use LLVM inside a DLL so I completely sympathize with you about terminating execution on the spot. Although the patch doesn't address this I think calling llvm::report_fatal_error() if the secure functions fail is a good idea because instead of crashing now LLVM can exit gracefully or if an error handler is available then a controlled situation. In Windows the secure functions allow the use of an error handler to be called if parameter validation fails. The custom error handler can dirently call report_fata_error().

How do you plan to enforce that the insecure functions aren't called?
[JM] How about modifying the LLVM programmer's manual should to add a section about the use of secure functions? I can provide a blurb for it.

[1] http://codeketchup.blogspot.com/2012/02/sprintf-strcpy-strncpy-strcpys-what-is.html

Thanks,
Javier

-----Original Message-----
From: Nick Lewycky [mailto:nicholas at mxc.ca] 
Sent: Wednesday, September 19, 2012 2:11 AM
To: Martinez, Javier E
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Handling of unsafe functions

Martinez, Javier E wrote:
> Hello,
>
> We have identified functions in LLVM sources using a static code 
> analyzer which are marked as a "security vulnerability"[1][2]. There 
> has been work already done to address some of them for Linux (e.g.
> snprintf). We are attempting to solve this issue in a comprehensive 
> fashion across all platforms. Most of the functions identified are for 
> manipulating strings. Memcpy is the most commonly used of all these 
> unsecure methods. The following table lists all these functions are 
> their recommended secure alternatives.
>
> Recommended alternatives:
>
> Functions Windows Unix/Mac OS
>
> Memcpy memcpy_s -
>
> Sprint sprintf_s snprintf
>
> Sscanf scanf_s -
>
> _alloca _malloca -
>
> Strcat strcat_s strlcat
>
> Strcpy strcpy_s strlcpy
>
> Strtok strtok_s -
>
> The proposal is to add secure versions of these functions. These 
> functions will be implemented in LLVM Support module and be used by 
> all other LLVM modules. The interface of these methods will be 
> platform independent while their implementation will be platform 
> specific (like the Mutex class in Support module). In cases where the 
> platform does not support the functionality natively, we are writing 
> an implementation of these functions. For example, in the case of 
> memcpy the secure function will look like llvm::memcpy_secure.
>
> Some secure functions require additional data that needs to be passed 
> (like buffer sizes). That information has to be added in all places of 
> invocation. In some cases, this requires an extra size_t argument to 
> be passed through. Hence, this change would not just be a one to one 
> function refactoring. The attached patch helps illustrate how an 
> instance of memcpy would be modified.
>
> Is this proposal of interest to the LLVM community? Can you also 
> comment if the approach specified is good to address this issue?

Personally, I'm not particularly interested in blanket replacement of memcpy with memcpy_s in the hopes that it might close a security hole. I am very interested in fixing any actual bugs. If it's easier to fix real bugs by aggressively using this additional layer, then that may well be the way to go, but before I agree to that, I've got a ton of questions to answer first.

What's the current error rate? How often are we seeing bugs in llvm that would be fixed if only we were calling "secure" functions?

What's the impact of calling the secure function? On Release builds and on Debug builds? On size and performance?

Why not rely on platforms to secure these functions? For instance, Linux and Darwin both have FORTIFY_SOURCE, and I'm too ignorant of Windows to know what the equivalent is there. What about existing tools like valgrind or ASAN?

What happens if memcpy_secure does detect an insecure memcpy? It's considered very rude for LLVM to terminate on the spot since it's often used as a library, so how do we handle the error? By calling llvm::report_fatal_error and hoping we don't recurse? What if it's a debug build and we'd like to see where the code went wrong?

How do you plan to enforce that the insecure functions aren't called?

Nick

> References:
>
> [1] http://msdn.microsoft.com/en-us/library/ms235384(v=vs.80).aspx
>
> [2]
> https://developer.apple.com/library/mac/#documentation/Security/Concep
> tual/SecureCodingGuide/Articles/BufferOverflows.html
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev