[LLVMdev] alloc_size metadata
John Criswell
criswell at illinois.edu
Fri May 25 09:38:39 PDT 2012
On 5/25/12 10:43 AM, Duncan Sands wrote:
> Hi John,
>
> On 25/05/12 17:22, John Criswell wrote:
>> On 5/25/12 2:16 AM, Duncan Sands wrote:
>>> Hi John,
>>>
>>>>>> I'm implementing the alloc_size function attribute in clang.
>>>>> does anyone actually use this attribute? And if they do, can it
>>>>> really buy them anything? How about "implementing" it by ignoring it!
>>>>
>>> ...
>>>>
>>>> Currently, SAFECode has a pass which just recognizes certain functions
>>>> as allocators and knows how to interpret the arguments to find the size.
>>>> If we want SAFECode to work with another allocator (like a program's
>>>> custom allocator, the Objective-C allocator, the Boehm garbage collector,
>>>> etc), then that pass needs to be modified to recognize it. Having to
>>>> update this pass for every allocator name and type is one of the few
>>>> reasons why SAFECode only works with C/C++ and not just any old language
>>>> that is compiled down to LLVM IR.
>>>
>>>
>>>> Nuno's proposed feature would allow programmers to communicate the
>>>> relevant information about allocators to tools like SAFECode and ASan.
>>>> I think it might also make some of the optimizations in LLVM that
>>>> require knowing about allocators work on non-C/C++ code.
>>>
>>> these are good points. The attribute and proposed implementation feel
>>> pretty clunky though, which is my main gripe.
>>
>> Hrm. I haven't formed an opinion on what the attributes should look like.
>> I think supporting the ones established by GCC would be important for
>> compatibility, and on the surface, they look reasonable. Devising better
>> ones for Clang is fine with me. What about them feels klunky?
>
> basically it feels like "I only know about C, here's something that
> pretends to be general but only handles C". Consider a language with a
> string type that contains the string length as well as the characters.
> It has a library function allocate_string(length). How much does it
> allocate? length+4 bytes. That can't be represented by alloc_size.
> What's more, it may well store the length at the start, and return a
> pointer to just after the length: a pointer to the first character.
> alloc_size can't represent "the allocated memory starts 4 bytes before
> the return value" either. In short, it feels like a hack for handling
> something that turns up in some particular C code that someone has,
> rather than a general solution to the general problem.
True. It also doesn't handle a number of "C" allocators like strdup(),
etc. Making it that general, though, may be tricky, and I don't think
it negates the utility of the simpler form. I suspect a fair number of
allocators could be described by the alloc_size feature.
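
To make the limitation discussed above concrete, here is a minimal C sketch of the length-prefixed string allocator from the quoted example. Everything in it is an assumption made for illustration: the 4-byte header is modeled as a uint32_t, and string_length() and free_string() are hypothetical helpers that do not appear in the original message. The point is only that the usable block is length+4 bytes and starts 4 bytes before the returned pointer, neither of which alloc_size can express.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime routine modeled on allocate_string(length).
   It allocates length + 4 bytes, stores the length in the first 4 bytes,
   and returns a pointer to the first character, i.e. 4 bytes past the
   start of the allocation. */
char *allocate_string(uint32_t length)
{
    unsigned char *block = malloc((size_t)length + sizeof(uint32_t));
    if (block == NULL)
        return NULL;
    memcpy(block, &length, sizeof length);      /* write the length header */
    return (char *)(block + sizeof(uint32_t));  /* skip past the header */
}

/* Hypothetical helper: read the length stored just before the characters. */
uint32_t string_length(const char *s)
{
    uint32_t length;
    assert(s != NULL);
    memcpy(&length, s - sizeof(uint32_t), sizeof length);
    return length;
}

/* Hypothetical helper: free the whole block, starting at the header. */
void free_string(char *s)
{
    if (s != NULL)
        free(s - sizeof(uint32_t));
}
```

Annotating allocate_string() with alloc_size(1) would describe the number of bytes usable from the returned pointer onward, but a tool would still not know about the 4 header bytes in front of it.
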
Even in the C/C++ world, I think it'd be useful. There's the
GC_malloc() in Boehm's garbage collector, kmalloc() in the Linux kernel,
malloc wrappers in applications, memalign(), etc.
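
As a rough illustration of the simple case, the sketch below annotates two malloc wrappers with the GCC-style alloc_size attribute. The names xmalloc, xcalloc and known_size_of_xmalloc_64 are hypothetical and chosen for this example only. It assumes a compiler that accepts the attribute and provides __builtin_object_size; whether the size is actually visible at a given call site depends on the compiler and optimization level, so the helper may just as well report "unknown", which is (size_t)-1.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: argument 1 is the allocation size in bytes. */
__attribute__((malloc, alloc_size(1)))
void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        abort();
    return p;
}

/* Hypothetical wrapper: the size is the product of arguments 1 and 2. */
__attribute__((malloc, alloc_size(1, 2)))
void *xcalloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p == NULL)
        abort();
    return p;
}

/* Asks the compiler what it knows about the object returned by
   xmalloc(64). The result is either the known size or (size_t)-1
   when the compiler cannot determine it. */
size_t known_size_of_xmalloc_64(void)
{
    void *p = xmalloc(64);
    size_t n = __builtin_object_size(p, 0);
    free(p);
    return n;
}
```

The same annotation would fit allocators with a malloc-like or calloc-like signature, where the size is one argument or the product of two.
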
>
>>> Since LLVM already has utility functions for recognizing allocators
>>> (i.e. that know about malloc, realloc and -fno-builtin etc) can't
>>> SAFECode just make use of them?
>>
>> It probably could. It doesn't simply because SAFECode was written before
>> these features existed within LLVM. :)
>>
>> [snip]
> no, I'm thinking that SAFECode won't need to look at or worry about the
> attribute at all, because the LLVM methods will know about it and serve
> up the appropriate info. Take a look at Analysis/MemoryBuiltins.h. In
> spite of the names, things like extractMallocCall are dealing with
> "malloc like" functions, such as C++'s "new" as well as malloc. Similarly
> for calloc. So you could use those right now to extract "malloc" and
> "calloc" sizes. If alloc_size is implemented, presumably these would just
> magically start to give you useful sizes for functions annotated with
> that attribute too.
I see. That makes sense.
-- John T.