[LLVMdev] alloc_size metadata

Nuno Lopes nunoplopes at sapo.pt
Wed May 30 08:36:02 PDT 2012


>> >> >> I think this is a good point, here's a suggestion:
>> >> >>
>> >> >> Have the metadata name two functions, both assumed to have the
>> >> >> same signature as the tagged function, one which returns the
>> >> >> offset of the start of the allocated region and one which
>> >> >> returns the length of the allocated region. Alternatively,
>> >> >> these functions could take the same signature and additionally
>> >> >> the returned pointer of the tagged function, and then one
>> >> >> function can return the start of the region and the other the
>> >> >> length.
>> >> >
>> >> > Ok, so this seems to be the most general proposal, which can
>> >> > obviously handle all cases.
>> >>
>> >> I agree.  Variation: have one function return the offset of the
>> >> start of the memory, and the other the offset of the end of the
>> >> memory (or the end plus 1), i.e. a range.  This seems more uniform
>> >> to me, but I don't have a strong preference.
>> >>
>> >> > Something like this would work:
>> >> >
>> >> > define i8* @foo() {
>> >> >     %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
>> >> >     %call = tail call i8* @my_recalloc(i8* null, i32 42) nounwind, !alloc_size !{i32 %0}
>> >> >     ret i8* %call
>> >> > }
>> >> >
>> >> > Basically I just added a function call as the metadata (it's not
>> >> > currently possible to add the function itself to the metadata;
>> >> > the function call is required instead).
>> >> > As long as the function is marked as readnone, I think it
>> >> > shouldn't interfere with the optimizers, and we can have a later
>> >> > pass to drop the metadata and remove the calls.  I still don't
>> >> > like having the explicit calls there, though.  Any suggestions
>> >> > to remove the function calls from there?
>> >>
>> >> How about this:
>> >>
>> >> define i32 @lo(i32) {
>> >>    ret i32 0
>> >> }
>> >>
>> >> define i32 @hi(i32 %n) {
>> >>    ret i32 %n
>> >> }
>> >>
>> >> declare i8* @wonder_allocator(i32)
>> >>
>> >> define i8* @foo(i32 %n) {
>> >>    %r = call i8* @wonder_allocator(i32 %n), !alloc !0
>> >>    ret i8* %r
>> >> }
>> >>
>> >> !0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }
>> >
>> > This is the format that I had in mind.
>> >
>> >>
>> >>
>> >> The main problem I see is that if you declare @lo and @hi to have
>> >> internal linkage then the optimizers will zap them.  Maybe there's
>> >> a neat solution to that.
>> >
>> > I would consider the optimizer doing this a feature, not a problem.
>> > That having been said, we need to make sure that the optimizer does
>> > not zap them before the analysis/instrumentation passes get to run.
>>
>> This is actually non-trivial to accomplish.
>> Metadata doesn't count as a user, so internal functions with no
>> other usage will get removed.
>
> I thought that it is possible to have passes run before the optimizer
> performs such deletions. Is this not practical? Another option is to
> change the current implementation to delete such functions in two phases:
> in the first phase we leave functions with metadata references. In the
> second phase (which runs near the end of the pipeline) we delete
> functions regardless of metadata references.

Right now, if you list the users of a Value, the references coming  
from metadata won't appear. Metadata is not a user and doesn't count
towards the number of uses of a value.  That's why using anything  
about constant expressions risks disappearing.
Leaving non-used functions to be removed after all optimizations could  
be done. But then you would probably want to, for example, patch the  
pass manager so that it didn't run a function pass over dead  
functions, and so on.
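
To make the use-list point concrete, here is a toy model in Python. It is
not LLVM code and every name in it (Function, md_refs, dead_function_elim)
is made up for illustration. It only shows why @lo and @hi vanish when
metadata references are not counted as uses, and what the two-phase
deletion quoted above would change.

```python
# Toy model, not LLVM's API: every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class Function:
    name: str
    internal: bool                                # internal linkage?
    users: list = field(default_factory=list)     # real uses (calls, etc.)
    md_refs: list = field(default_factory=list)   # references from metadata


def dead_function_elim(module, keep_metadata_refs):
    """Return the functions that survive one round of dead-function removal.

    keep_metadata_refs=False models the behaviour described above:
    a metadata reference is not a use, so it keeps nothing alive.
    keep_metadata_refs=True models the proposed first phase.
    """
    survivors = []
    for f in module:
        alive = (not f.internal) or bool(f.users)
        if keep_metadata_refs and f.md_refs:
            alive = True
        if alive:
            survivors.append(f)
    return survivors


lo = Function("lo", internal=True, md_refs=["!0"])
hi = Function("hi", internal=True, md_refs=["!0"])
foo = Function("foo", internal=False)
module = [lo, hi, foo]

early = [f.name for f in dead_function_elim(module, keep_metadata_refs=True)]
late = [f.name for f in dead_function_elim(module, keep_metadata_refs=False)]
print(early)  # ['lo', 'hi', 'foo']
print(late)   # ['foo']
```

The model leaves out the part that makes this costly in practice: between
the two phases the surviving dead functions are still in the module, so
every function pass would still visit them unless the pass manager is
taught to skip them.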


>> On the other hand, we could use
>> @llvm.used to make sure the functions had (at least) one user, but
>> that's probably equivalent to not using internal linkage.
>> And I still want to make sure that these functions disappear in the
>> final binary..
>>
>> Another thing that bothers me is the implementation of the
>> objectsize intrinsic. This intrinsic returns the *constant* size of
>> the pointed object given as argument (if the object has a constant
>> size). However, with this function scheme, the implementation would
>> be a bit heavy, since it would need to inline the @lo and @hi
>> functions, simplify the resulting expression, and then check if the
>> result is a ConstantInt. And remember that in general these functions
>> can be arbitrarily complex.
>
> I agree; we'd need to use SCEV or some other heavyweight mechanism to
> do the analysis. In some sense, however, that would be the price of
> generality. On the other hand, I see no reason why we could not write a
> utility function that could accomplish all of that, so we'd only need
> to work out the details once.

SCEV is not the answer here. You just want to know if the result of a  
function is constant given a set of parameters. Inlining +  
simplifications should do it.  But doing an inlining trial is expensive.
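
As a rough sketch of what "inline + simplify, then check for a constant"
amounts to, here is a toy version in Python. It assumes the size function
is a straight-line body of add/sub/mul operations with no control flow,
which real @lo/@hi functions need not be. The representation and the name
fold_call are hypothetical and not taken from LLVM.

```python
# Toy model of an "inlining trial"; not LLVM code, all names hypothetical.
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def fold_call(params, body, result, args):
    """Substitute call-site arguments into a straight-line body and fold.

    params: parameter names of the size function
    body:   list of (dest, op, lhs, rhs); operands are names or int literals
    result: name (or int literal) that the function returns
    args:   call-site arguments; an int is a known constant, None is unknown
    Returns the constant result, or None if it does not fold to a constant.
    """
    env = dict(zip(params, args))

    def value(operand):
        return operand if isinstance(operand, int) else env.get(operand)

    for dest, op, lhs, rhs in body:
        a, b = value(lhs), value(rhs)
        env[dest] = OPS[op](a, b) if a is not None and b is not None else None
    return value(result)


# Hypothetical size function for a calloc-like allocator: size = n * elem
params = ["n", "elem"]
body = [("t", "mul", "n", "elem")]

const_size = fold_call(params, body, "t", [10, 4])
unknown_size = fold_call(params, body, "t", [None, 4])
print(const_size)    # 40
print(unknown_size)  # None
```

Even this toy walks the whole body once per call site. With branches,
loops or nested calls in the size function, the same question needs a real
inlining and simplification run, which is where the cost comes from.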


> I think that this kind of issue will come up again in the future. Any
> time someone asks, "how can a frontend pass <some complicated
> constraint or information> to the backend", this kind of functionality
> will be needed.

Yes. Maybe we should have a separate mini-expression language for the  
metadata? I dunno if it's worth the effort..
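
To give an idea of what I mean, here is a toy sketch in Python. The
parenthesized prefix syntax, the argN names and the three operators are
all invented for this example. Nothing like this exists in LLVM, and a
real design would need far more thought.

```python
# Purely hypothetical expression syntax; nothing like it exists in LLVM.
def tokenize(text):
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression: an integer, argN, or (op expr expr)."""
    tok = tokens.pop(0)
    if tok == "(":
        op = tokens.pop(0)
        lhs = parse(tokens)
        rhs = parse(tokens)
        if tokens.pop(0) != ")":
            raise ValueError("expected ')'")
        return (op, lhs, rhs)
    if tok.startswith("arg"):
        return ("arg", int(tok[3:]))
    return int(tok)


def evaluate(expr, args):
    """Return the constant value of expr, or None if an argument is unknown."""
    if isinstance(expr, int):
        return expr
    if expr[0] == "arg":
        return args[expr[1]]
    op, lhs, rhs = expr
    a, b = evaluate(lhs, args), evaluate(rhs, args)
    if a is None or b is None:
        return None
    return {"add": a + b, "sub": a - b, "mul": a * b}[op]


# Hypothetical annotation for an allocator returning arg0 * arg1 + 8 bytes.
size_expr = parse(tokenize("(add (mul arg0 arg1) 8)"))
known = evaluate(size_expr, [10, 4])
unknown = evaluate(size_expr, [None, 4])
print(known)    # 48
print(unknown)  # None
```

With something like this, objectsize would only have to walk a small
expression tree instead of inlining arbitrary functions, and there would
be no helper functions for the optimizers to delete. The price is that the
language can only express what its operators allow.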

Nuno


