[LLVMdev] Handling of unsafe functions
silvas at purdue.edu
Tue Sep 18 18:25:02 PDT 2012
I generally disagree with the approach.
Generally char* strings aren't recommended for use in LLVM and this
kind of string manipulation in LLVM shouldn't be done with the
primitive C library functions. The Programmer's Manual gives the
preferred types to use for strings, and all of them keep track of
length. There are also safe routines for creating and formatting
strings, such as raw_ostream, which is used pervasively in LLVM.
The example routine in your patch probably should just use
raw_string_ostream or raw_svector_ostream, instead of relying on
C-style string routines. That way, correctness is enforced by the
compiler instead of depending on someone manually laboring over the
details (like checking the return code, which your patch doesn't do...).
In other words, there are completely safe alternatives for these
functions for almost all cases.
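A minimal sketch of the stream-based style, for illustration only. To keep
it compilable without an LLVM checkout, std::ostringstream stands in for
llvm::raw_string_ostream (which supports the same `<<` pattern on top of a
std::string); the helper name makeLabel is hypothetical and not taken from
the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<prefix>.<index>" with no fixed-size buffer.
// std::ostringstream is a stand-in here; in LLVM code the same shape would
// use llvm::raw_string_ostream writing into a std::string.
std::string makeLabel(const std::string &Prefix, unsigned Index) {
  assert(!Prefix.empty() && "prefix must not be empty");
  std::ostringstream OS;
  // The stream owns and grows its storage, so there is no destination
  // size to compute and no return code to forget to check.
  OS << Prefix << '.' << Index;
  return OS.str();
}
```

Because the stream tracks its own length, a prefix of any size is handled
the same way as a short one; there is no buffer to overrun.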
One particular use case that usually pertains to memcpy, though, is when
performance is a significant concern: the author "knows what they are
doing" and isn't willing to sacrifice performance by calling into some
"secure" version when they have other assurances that the target buffer
has sufficient space. The performance difference can be significant,
since memcpy is usually turned into a compiler builtin that the compiler
recognizes and optimizes specially, whereas the suggested approach
results in a regular call into an "llvm::*_secure" wrapper, which then
calls into the OS-provided general-purpose "secure" version.
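The trade-off can be sketched as follows. This is an illustration under
assumptions, not the code from the patch: checkedCopy, uncheckedCopy,
copyHeader and Header are hypothetical names, and the wrapper is defined in
the same file here, whereas the proposal would place it in the Support
library and forward to a platform routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper in the spirit of the proposed "secure" functions.
// Returns 0 on success and nonzero if the arguments are invalid or the
// data does not fit. Every caller pays for the checks and the extra call.
int checkedCopy(void *Dst, std::size_t DstSize, const void *Src,
                std::size_t Count) {
  if (!Dst || !Src || Count > DstSize)
    return 1;
  std::memcpy(Dst, Src, Count);
  return 0;
}

// The caller has "other assurances" about the buffer: the size check is an
// assert, which documents the invariant and disappears under NDEBUG.
void uncheckedCopy(void *Dst, std::size_t DstSize, const void *Src,
                   std::size_t Count) {
  assert(Count <= DstSize && "caller must guarantee sufficient space");
  (void)DstSize; // unused when asserts are compiled out
  std::memcpy(Dst, Src, Count);
}

struct Header {
  unsigned Magic;
  unsigned Version;
};

// Direct memcpy with a compile-time-constant size. Compilers commonly
// recognize this form and expand it inline instead of emitting a call.
void copyHeader(Header &Dst, const Header &Src) {
  std::memcpy(&Dst, &Src, sizeof(Header));
}
```

Whether the direct form is actually expanded inline depends on the
compiler and optimization level, so the cost of the wrapper is best
measured rather than assumed.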
I think that it would be useful if you used the output of your static
analyzer to provide a list of the places where C-style string
manipulation is being done, so that these places can be migrated to
using modern, safe LLVM interfaces for these operations.
On Tue, Sep 18, 2012 at 8:00 PM, Martinez, Javier E
<javier.e.martinez at intel.com> wrote:
> We have identified functions in LLVM sources using a static code analyzer
> which are marked as a “security vulnerability”. There has been work
> already done to address some of them for Linux (e.g. snprintf). We are
> attempting to solve this issue in a comprehensive fashion across all
> platforms. Most of the functions identified are for manipulating strings.
> Memcpy is the most commonly used of all these insecure functions. The
> following table lists all these functions and their recommended secure
> alternatives:
> Function   Windows     Unix/Mac OS
> memcpy     memcpy_s    -
> sprintf    sprintf_s   snprintf
> sscanf     scanf_s     -
> _alloca    _malloca    -
> strcat     strcat_s    strlcat
> strcpy     strcpy_s    strlcpy
> strtok     strtok_s    -
> The proposal is to add secure versions of these functions. These functions
> will be implemented in the LLVM Support module and be used by all other
> LLVM modules. The interface of these functions will be platform independent
> while their implementation will be platform specific (like the Mutex class
> in the Support module). In cases where the platform does not support the
> functionality natively, we are writing an implementation of these functions.
> For example, in the case of memcpy the secure function will look like
> Some secure functions require additional data that needs to be passed (like
> buffer sizes). That information has to be added in all places of invocation.
> In some cases, this requires an extra size_t argument to be passed through.
> Hence, this change would not just be a one to one function refactoring. The
> attached patch helps illustrate how an instance of memcpy would be modified.
> Is this proposal of interest to the LLVM community? Can you also comment if
> the approach specified is good to address this issue?
>  http://msdn.microsoft.com/en-us/library/ms235384(v=vs.80).aspx
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu