RFC: Source-level attribute for conservative __builtin_object_size(..., 1)

Duncan P. N. Exon Smith via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 25 12:43:47 PST 2016


Context
=======

r245403 and follow-ups improved __builtin_object_size(ptr, 1).  The type=1
version of __builtin_object_size returns the maximum possible distance
between `ptr` and the end of the subobject it's part of.  E.g., in:
```
struct S {
   int i[2];
   char c[5];
};
```
then, given `struct S s;` and a 4-byte `int`,
__builtin_object_size(s.i, 1) should return 8, and
__builtin_object_size(s.c + 2, 1) should return 3.
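
A minimal sketch of that example, assuming a compiler that provides
__builtin_object_size (GCC or clang).  The helper names are made up for
illustration.  When the compiler cannot determine a size, the type=1 query
yields (size_t)-1, so the sanity checks allow for that too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct S {
   int i[2];
   char c[5];
};

struct S s;

/* Bytes from s.i to the end of the `i` subobject. */
size_t size_from_i(void) {
   return __builtin_object_size(s.i, 1);
}

/* Bytes from s.c + 2 to the end of the `c` subobject (5 - 2). */
size_t size_from_c2(void) {
   return __builtin_object_size(s.c + 2, 1);
}

/* Print both results; (size_t)-1 means "unknown to the compiler". */
void print_sizes(void) {
   size_t a = size_from_i();
   size_t b = size_from_c2();
   assert(a == sizeof s.i || a == (size_t)-1);
   assert(b == sizeof s.c - 2 || b == (size_t)-1);
   printf("s.i     -> %zu\n", a);
   printf("s.c + 2 -> %zu\n", b);
}
```

Calling print_sizes() from main() prints the two values for whichever
compiler and optimization level is in use.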

r250488 relaxed the rules for 0- or 1-sized arrays.  Consider
```
struct S {
   int i[2];
   char c[0];
};
```
The expectation is that S will be over-allocated by some number of bytes,
so that `S::c` has a runtime-determined array size.  r250488 changed
__builtin_object_size(ptr, 1) to be more conservative in this case, falling
back to "type=0".  (Type=0 returns the maximum possible distance between
`ptr` and the end of the allocation.)
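
A hedged sketch of the over-allocation pattern, assuming GCC or clang (the
`char c[0]` member is a GNU extension).  The helper names are hypothetical.
The exact values returned depend on the compiler, its version, and the
optimization level, so none are claimed here.  The only property relied on
is that a type=0 answer is never smaller than the real remaining space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct S {
   int i[2];
   char c[0];   /* zero-length trailing array (GNU extension) */
};

/* Over-allocate an S so that `c` has room for `extra` bytes. */
struct S *alloc_s(size_t extra) {
   assert(extra <= SIZE_MAX - sizeof(struct S));   /* no size overflow */
   return malloc(sizeof(struct S) + extra);
}

/* type=0: distance from p->c to the end of the whole allocation,
   or (size_t)-1 if the compiler cannot tell. */
size_t whole_object_size(struct S *p) {
   return __builtin_object_size(p->c, 0);
}

/* type=1: distance from p->c to the end of the `c` subobject.  A strict
   reading gives 0 here; a conservative compiler answers as for type=0. */
size_t subobject_size(struct S *p) {
   return __builtin_object_size(p->c, 1);
}
```

With a strict type=1 answer of 0, any fortified write into `c` would be
rejected, which is why the fallback matters for this pattern.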

Problem
=======

We use __builtin_object_size(ptr, 1) in our strcpy implementation when
_FORTIFY_SOURCE=2 (the default).  The code is something equivalent to:
```
#define strcpy(a, b) \
   __builtin___strcpy_chk(a, b, __builtin_object_size(a, 1))
```
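
To make the check concrete, here is a rough sketch of what such a fortified
copy amounts to.  It is not our actual implementation: `copy_fits`,
`checked_strcpy_impl` and `checked_strcpy` are hypothetical names, and the
sketch returns NULL on failure where a real `_chk` function would abort.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a string of length `len` plus its NUL fits in `objsize`
   bytes.  (size_t)-1 means the compiler could not determine the size,
   so no check is possible and the copy is allowed. */
int copy_fits(size_t objsize, size_t len) {
   return objsize == (size_t)-1 || len < objsize;
}

/* Hypothetical checked strcpy: refuses the copy instead of aborting. */
char *checked_strcpy_impl(char *dst, const char *src, size_t objsize) {
   assert(dst != NULL && src != NULL);
   if (!copy_fits(objsize, strlen(src)))
      return NULL;
   return strcpy(dst, src);
}

/* The object size is computed at the call site, where the compiler can
   still see which subobject `dst` points into. */
#define checked_strcpy(dst, src) \
   checked_strcpy_impl(dst, src, __builtin_object_size(dst, 1))
```

Because the size passed in is the type=1 (subobject) size, a destination
that is a 104-byte member is checked against 104 bytes no matter how large
the enclosing allocation is.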

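This is causing a problem for some structs in our headers that end with a
char array of size > 1 and are expected to be over-allocated.  The check
uses the declared array size, so a copy that fits in the over-allocated
object is still treated as an overflow.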

One example is `sockaddr_un`.  From the unix(4) manpage:
http://www.unix.com/man-page/FreeBSD/4/unix/
```
struct sockaddr_un {
   u_char  sun_len;
   u_char  sun_family;
   char    sun_path[104];
};
```

This has been around a long time.  Unfortunately, `sizeof(sockaddr_un)`
cannot really change, and there's a ton of code out there that uses
strcpy() on this struct.  We need some way for clang to be conservative
in this case.
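
For illustration, a sketch of the over-allocation idiom in question.  It
uses a stand-in struct with the same layout as above (`sockaddr_un_like`
and `make_addr` are hypothetical names, chosen to avoid clashing with the
system header), and it copies with memcpy so that the sketch itself does
not depend on how strcpy is fortified.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in with the same layout as the sockaddr_un shown above. */
struct sockaddr_un_like {
   unsigned char sun_len;
   unsigned char sun_family;
   char          sun_path[104];
};

/* Allocate an address big enough for `path`, even when the path is
   longer than the declared 104 bytes.  Returns NULL on failure. */
struct sockaddr_un_like *make_addr(const char *path) {
   assert(path != NULL);
   size_t len = strlen(path);
   size_t need = offsetof(struct sockaddr_un_like, sun_path) + len + 1;
   if (need > UCHAR_MAX)            /* sun_len is a single byte */
      return NULL;
   if (need < sizeof(struct sockaddr_un_like))
      need = sizeof(struct sockaddr_un_like);
   struct sockaddr_un_like *sa = calloc(1, need);
   if (sa == NULL)
      return NULL;
   sa->sun_len = (unsigned char)need;
   /* May write past the declared end of sun_path, but stays inside
      the allocation. */
   memcpy(sa->sun_path, path, len + 1);
   return sa;
}
```

A strcpy into `sa->sun_path` checked against the declared 104 bytes would
reject the long-path case even though the allocation has room for it.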

Solution
========

We're thinking about adding an attribute, something like:
```
struct sockaddr_un {
   u_char  sun_len;
   u_char  sun_family;
   char    sun_path[104] __attribute__((variable_length_array));
};
```
(is there a better name?).  This would allow us to decorate our headers,
and explicitly opt-in on a struct-by-struct basis to the conservative
behaviour that r250488 gives to 0- and 1-sized arrays.
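
A sketch of how a header might adopt such an attribute without breaking
compilers that lack it.  `variable_length_array` is only the name floated
here and is not known to be implemented anywhere; `TRAILING_VLA`,
`sockaddr_un_like` and `path_size_hint` are hypothetical names.  The macro
expands to nothing unless the compiler reports support.

```c
#include <assert.h>
#include <stddef.h>

/* Use the proposed attribute only if the compiler says it exists. */
#if defined(__has_attribute)
#  if __has_attribute(variable_length_array)
#    define TRAILING_VLA __attribute__((variable_length_array))
#  endif
#endif
#ifndef TRAILING_VLA
#  define TRAILING_VLA
#endif

/* Stand-in for sockaddr_un, decorated with the opt-in macro. */
struct sockaddr_un_like {
   unsigned char sun_len;
   unsigned char sun_family;
   char          sun_path[104] TRAILING_VLA;
};

/* The annotation must not change the layout. */
static_assert(sizeof(struct sockaddr_un_like) == 106,
              "layout must not change");

/* The query a fortified strcpy would make.  The intent of the proposal
   is that, with the attribute, this is answered as a type=0 query. */
size_t path_size_hint(struct sockaddr_un_like *sa) {
   return __builtin_object_size(sa->sun_path, 1);
}
```

Without the attribute, path_size_hint() reports either the declared 104
bytes or (size_t)-1 for "unknown", depending on the compiler.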

The only other option we can think of is to be conservative whenever the
subobject is part of an array at the end of a struct.
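
To compare the two options side by side, here is a toy model of the
decision "answer a type=1 query as if it were type=0".  This is not clang
code; the struct, enum and function names are invented for illustration,
and the first rule is only a paraphrase of the r250488 behaviour described
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the model knows about the member a pointer points into. */
struct member_info {
   bool   is_array;
   bool   is_last_member;
   size_t declared_elems;   /* 0 for [0] */
   bool   has_attribute;    /* the proposed per-struct opt-in */
};

enum policy {
   POLICY_R250488,              /* only 0- or 1-sized trailing arrays */
   POLICY_ATTRIBUTE,            /* plus explicitly annotated arrays   */
   POLICY_ANY_TRAILING_ARRAY    /* plus every trailing array          */
};

/* true => be conservative and fall back to type=0 */
bool falls_back_to_type0(enum policy p, struct member_info m) {
   if (!m.is_array || !m.is_last_member)
      return false;
   if (m.declared_elems <= 1)
      return true;
   switch (p) {
   case POLICY_R250488:            return false;
   case POLICY_ATTRIBUTE:          return m.has_attribute;
   case POLICY_ANY_TRAILING_ARRAY: return true;
   }
   assert(0 && "unknown policy");
   return false;
}
```

In this model the attribute keeps strict checking for every unannotated
trailing array, while the second option gives it up for all of them.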

Thoughts?

