[cfe-dev] RFC: Source-level attribute for conservative __builtin_object_size(..., 1)
Duncan P. N. Exon Smith via cfe-dev
cfe-dev at lists.llvm.org
Mon Mar 7 15:57:44 PST 2016
> On 2016-Mar-07, at 15:07, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Bob Wilson via cfe-dev" <cfe-dev at lists.llvm.org>
>> To: "Duncan P. N. Exon Smith" <dexonsmith at apple.com>
>> Cc: "Richard Smith" <richard at metafoo.co.uk>, "CFE Dev" <cfe-dev at lists.llvm.org>, "George Burgess IV via llvm-commits"
>> <llvm-commits at lists.llvm.org>
>> Sent: Monday, March 7, 2016 4:30:57 PM
>> Subject: Re: [cfe-dev] RFC: Source-level attribute for conservative __builtin_object_size(..., 1)
>>
>>
>>> On Feb 25, 2016, at 12:43 PM, Duncan P. N. Exon Smith
>>> <dexonsmith at apple.com> wrote:
>>>
>>> Context
>>> =======
>>>
>>> r245403 and follow-ups improved __builtin_object_size(ptr, 1). The
>>> type=1
>>> version of __builtin_object_size returns the maximum possible
>>> distance
>>> between `ptr` and the end of the subobject it's part of. E.g., in:
>>> ```
>>> struct S {
>>> int i[2];
>>> char c[5];
>>> };
>>> ```
>>> then __builtin_object_size(s.i, 1) should return 8, and
>>> __builtin_object_size(s.c + 2, 1) should return 3.
>>>
>>> r250488 relaxed the rules for 0- or 1-sized arrays. Consider
>>> ```
>>> struct S {
>>> int i[2];
>>> char c[0];
>>> };
>>> ```
>>> The expectation is that S will be over-allocated by some number of
>>> bytes,
>>> so that `S::c` has a runtime-determined array size. r250488
>>> changed
>>> __builtin_object_size(ptr, 1) to be more conservative in this case,
>>> falling
>>> back to "type=0". (Type=0 returns the maximum possible distance
>>> between
>>> `ptr` and the end of the allocation.)
>>>
>>> Problem
>>> =======
>>>
>>> We use __builtin_object_size(ptr, 1) in our strcpy implementation
>>> when
>>> _FORTIFY_SOURCE=2 (the default). The code is something equivalent
>>> to:
>>> ```
>>> #define strcpy(a, b, sz) \
>>> __builtin_strcpy_chk(a, b, sz, __builtin_object_size(a, 1))
>>> ```
>>>
>>> This is causing a problem for some
>>> structs in our headers that end with a char array that has a size >
>>> 1 and
>>> that are expected to be over-allocated.
>>>
>>> One example is `sockaddr_un`. From the unix(4) manpage:
>>> http://www.unix.com/man-page/FreeBSD/4/unix/
>>> ```
>>> struct sockaddr_un {
>>> u_char sun_len;
>>> u_char sun_family;
>>> char sun_path[104];
>>> };
>>> ```
>>>
>>> This has been around a long time. Unfortunately,
>>> `sizeof(sockaddr_un)`
>>> cannot really change, and there's a ton of code out there that uses
>>> strcpy() on this struct. We need some way for clang to be
>>> conservative
>>> in this case.
>>>
>>> Solution
>>> ========
>>>
>>> We're thinking about adding an attribute, something like:
>>> ```
>>> struct sockaddr_un {
>>> u_char sun_len;
>>> u_char sun_family;
>>> char sun_path[104] __attribute__((variable_length_array));
>>> };
>>> ```
>>> (is there a better name?). This would allow us to decorate our
>>> headers,
>>> and explicitly opt-in on a struct-by-struct basis to the
>>> conservative
>>> behaviour that r250488 gives to 0- and 1-sized arrays.
>>>
>>> The only other option we can think of us to be conservative
>>> whenever the
>>> the subobject is part of an array at the end of a struct.
>>>
>>> Thoughts?
>>
>> Ping. Does anyone have feedback on this?
>
> If you can add an attribute to your headers, this seems like a good solution. We might also need the conservative option for dealing with older headers. Regarding the attribute, is is really a property of the array member, or a property of the structure (i.e. that it will be overallocated)?
I was thinking only of the narrow fix. But applying it to the struct
seems more interesting. It formalizes other types of over-allocation.
Also (maybe this is where you were going), this would let someone add
the attribute after #include.
--
// Define 'struct something' in a header that can't be changed.
#include <something.h>
// Add an attribute.
struct something __attribute__((overallocated));
--
I like this better.
> FWIW, the proposed attribute seems independently useful for eliminating unnecessary warnings. For example, if we have:
>
> void foo(struct sockaddr_un *s) {
> s->sun_path[106] = 0;
> }
>
> we generate this:
>
> warning: array index 106 is past the end of the array (which contains 104 elements) [-Warray-bounds]
> s->sun_path[106] = 0;
>
> which, as you describe the situation, is not useful.
Good point. I think we could silence this with either attribute
location, right?
More information about the cfe-dev
mailing list