[LLVMdev] alloc_size metadata

Duncan Sands baldrick at free.fr
Fri May 25 08:43:52 PDT 2012


Hi John,

On 25/05/12 17:22, John Criswell wrote:
> On 5/25/12 2:16 AM, Duncan Sands wrote:
>> Hi John,
>>
>>>>> I'm implementing the alloc_size function attribute in clang.
>>>> does anyone actually use this attribute? And if they do, can it really buy
>>>> them anything? How about "implementing" it by ignoring it!
>>>
>> ...
>>>
>>> Currently, SAFECode has a pass which just recognizes certain functions as
>>> allocators and knows how to interpret the arguments to find the size. If we want
>>> SAFECode to work with another allocator (like a program's custom allocator, the
>>> Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
>>> to be modified to recognize it. Having to update this pass for every allocator
>>> name and type is one of the few reasons why SAFECode only works with C/C++ and
>>> not just any old language that is compiled down to LLVM IR.
>>
>>
>>> Nuno's proposed feature would allow programmers to communicate the relevant
>>> information about allocators to tools like SAFECode and ASan. I think it might
>>> also make some of the optimizations in LLVM that require knowing about
>>> allocators work on non-C/C++ code.
>>
>> these are good points. The attribute and proposed implementation feel pretty
>> clunky though, which is my main gripe.
>
> Hrm. I haven't formed an opinion on what the attributes should look like. I
> think supporting the ones established by GCC would be important for
> compatibility, and on the surface, they look reasonable. Devising better ones
> for Clang is fine with me. What about them feels clunky?

basically it feels like "I only know about C, here's something that pretends to
be general but only handles C".  Consider a language with a string type that
contains the string length as well as the characters.  It has a library function
allocate_string(length).  How much does it allocate?  length+4 bytes (the
characters plus a 4-byte length field).  That can't be represented by
alloc_size.  What's more, it may well store the length at the
start, and return a pointer to just after the length: a pointer to the first
character.  alloc_size can't represent "the allocated memory starts 4 bytes
before the return value" either.  In short, it feels like a hack for handling
something that turns up in some particular C code that someone has, rather than
a general solution to the general problem.
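
To make the string example concrete, here is a minimal sketch in C of such a
runtime routine.  The layout (a 4-byte length header followed by the
characters) and the helper names string_length and free_string are assumptions
made up for illustration; only allocate_string(length) comes from the text
above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: [uint32_t length][length bytes of characters]. */
#define STRING_HEADER_SIZE sizeof(uint32_t)

/* Allocates length + 4 bytes and returns a pointer to the first character,
   i.e. 4 bytes past the start of the allocated block. */
char *allocate_string(uint32_t length)
{
    unsigned char *block;

    /* Guard against overflow where size_t is only 32 bits wide. */
    if ((size_t)length > SIZE_MAX - STRING_HEADER_SIZE)
        return NULL;

    block = malloc(STRING_HEADER_SIZE + (size_t)length);
    if (block == NULL)
        return NULL;

    memcpy(block, &length, STRING_HEADER_SIZE);  /* store the length header */
    return (char *)(block + STRING_HEADER_SIZE); /* skip past the header */
}

/* Reads the header stored just before the first character. */
uint32_t string_length(const char *s)
{
    uint32_t length;
    memcpy(&length, s - STRING_HEADER_SIZE, STRING_HEADER_SIZE);
    return length;
}

/* Frees the whole block, which starts 4 bytes before the string pointer. */
void free_string(char *s)
{
    if (s != NULL)
        free(s - STRING_HEADER_SIZE);
}
```

Annotating allocate_string with alloc_size(1) would describe "length" bytes
starting at the returned pointer.  A bounds checker relying on that would
presumably treat the read in string_length as out of bounds, since neither the
extra 4 bytes nor the negative offset can be expressed.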

>> Since LLVM already has utility functions for recognizing allocators (i.e. that
>> know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
>> of them?
>
> It probably could. It doesn't simply because SAFECode was written before these
> features existed within LLVM.
> :)
>
>> Then either (1) something like alloc_size is implemented, the LLVM
>> utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
>> utility is taught about other allocators like Ada's, and SAFECode benefits
>> automagically.
>
> I'm not sure what you mean by "LLVM utility," but I think we're thinking along
> the same lines. Clang/LLVM implement the alloc_size attributes, we change
> SAFECode to recognize it, and so when people use it, SAFECode benefits
> automagically.
>
> Am I right that we're thinking the same thing, or did I completely misunderstand
> you?

no, I'm thinking that SAFECode won't need to look at or worry about the
attribute at all, because the LLVM methods will know about it and serve
up the appropriate info.  Take a look at Analysis/MemoryBuiltins.h.  In
spite of the names, things like extractMallocCall deal with "malloc-like"
functions, such as C++'s "new" as well as malloc.  Similarly for
calloc.  So you could use those right now to extract "malloc" and "calloc"
sizes.  If alloc_size is implemented, presumably these would just magically
start to give you useful sizes for functions annotated with that attribute too.
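
For reference, this is roughly what "functions annotated with that attribute"
look like on the source side, using GCC's alloc_size syntax.  It is only a
sketch: the wrapper names my_malloc and my_calloc and the ALLOC_SIZE macro are
made up for illustration, and how (or whether) the MemoryBuiltins helpers would
consume the annotation is exactly what is still open in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the attribute only where the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(alloc_size)
#    define ALLOC_SIZE(...) __attribute__((alloc_size(__VA_ARGS__)))
#  endif
#endif
#ifndef ALLOC_SIZE
#  define ALLOC_SIZE(...) /* no-op on compilers without alloc_size */
#endif

/* malloc-like: the allocation size is argument 1. */
void *my_malloc(size_t size) ALLOC_SIZE(1);

/* calloc-like: the allocation size is argument 1 times argument 2. */
void *my_calloc(size_t count, size_t size) ALLOC_SIZE(1, 2);

/* Hypothetical custom allocator entry points; here they simply forward. */
void *my_malloc(size_t size)
{
    return malloc(size);
}

void *my_calloc(size_t count, size_t size)
{
    return calloc(count, size);
}
```

These two shapes (one size argument, or a count and an element size) are the
only ones the attribute can describe, which matches the malloc and calloc
cases the existing helpers already handle.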

Ciao, Duncan.


