[LLVMdev] alloc_size metadata

Nuno Lopes nunoplopes at sapo.pt
Thu May 31 23:37:26 PDT 2012


Hi,

Sorry for the delay; comments below.

>>>> This is actually non-trivial to accomplish.
>>>> Metadata doesn't count as a user, so internal functions with no
>>>> other usage will get removed.
>>>
>>> I thought that it is possible to have passes run before the optimizer
>>> performs such deletions. Is this not practical? Another option is to
>>> change the current implementation to delete such functions in two phases:
>>> in the first phase we leave functions with metadata references. In the
>>> second phase (which runs near the end of the pipeline) we delete
>>> functions regardless of metadata references.
>>
>> Right now, if you list the users of a Value, the references coming
>> from metadata won't appear. Metadata is not a user and doesn't count
>> towards the number of uses of a value.  That's why using anything
>> about constant expressions risks disappearing.
>> Leaving non-used functions to be removed after all optimizations could
>> be done. But then you would probably want to, for example, patch the
>> pass manager so that it didn't run a function pass over dead
>> functions, and so on.
>
> The functions could be declared to have linkonce_odr linkage.  That way
> they will be zapped after the inliner runs, but shouldn't be removed
> before.

I'm not convinced. You cannot force every analysis to run before
inlining. You're basically saying that all passes that analyze buffer
sizes must run quite early, and the inliner runs pretty early!
At least in the case of the buffer overflow pass, I want it to run late,
after most cleanups have been done. ASan does exactly the same.


>>>> Another thing that bothers me is the implementation of the
>>>> objectsize intrinsic. This intrinsic returns the *constant* size of
>>>> the pointed object given as argument (if the object has a constant
>>>> size). However, with this function scheme, the implementation would
>>>> be a bit heavy, since it would need to inline the @lo and @hi
>>>> functions, simplify the resulting expression, and then check if the
>>>> result is a ConstantInt. And remember that in general these functions
>>>> can be arbitrarily complex.
>>>
>>> I agree; we'd need to use SCEV or some other heavyweight mechanism to
>>> do the analysis. In some sense, however, that would be the price of
>>> generality. On the other hand, I see no reason why we could not write a
>>> utility function that could accomplish all of that, so we'd only need
>>> to work out the details once.
>>
>> SCEV is not the answer here. You just want to know if the result of a
>> function is constant given a set of parameters. Inlining +
>> simplifications should do it.  But doing an inlining trial is expensive.
>
> The hi/lo functions could be declared always_inline.  Thus they will always
> be inlined, either by the always-inliner pass or the usual one.  You would
> need to insert the instrumentation code or whatever that uses hi/lo before
> any inliner runs, and run optimizations such as turning objectsize into a
> constant after the inliner runs.

The semantics of the objectsize intrinsic are that it returns a constant
value if it can figure out the object size, and returns 0/-1 otherwise. So
you cannot simply inline the functions and hope for the best.  You need to
run an inline trial:  inline; try to fold the resulting expression into a
constant; remove the inlined code if it didn't fold to a constant.
You may say this is the price of generality. I don't know how slow it would
be, though.
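
To make the inline trial concrete, here is a rough sketch of the three steps
(inline; fold; discard if not constant). It is a toy C++17 model, not LLVM
code: Expr, inlineBody, fold, inlineTrial and objectSize are all names made
up for illustration, and the expression tree only stands in for the body of
a size function such as @hi.  The `min` flag is assumed to pick between the
0 and -1 fallbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Toy expression tree standing in for the body of a size function.
// Hypothetical model only; this is not LLVM IR or any LLVM API.
struct Expr {
  enum Kind { Const, Arg, Add, Mul };
  Kind kind;
  std::int64_t value;  // Const: the literal; Arg: the parameter index
  std::shared_ptr<Expr> lhs, rhs;
};
using ExprPtr = std::shared_ptr<Expr>;
using Operands = std::vector<std::optional<std::int64_t>>;

ExprPtr constant(std::int64_t v) {
  return std::make_shared<Expr>(Expr{Expr::Const, v, nullptr, nullptr});
}
ExprPtr arg(std::int64_t i) {
  return std::make_shared<Expr>(Expr{Expr::Arg, i, nullptr, nullptr});
}
ExprPtr add(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Add, 0, a, b});
}
ExprPtr mul(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Mul, 0, a, b});
}

// Step 1, "inline": copy the body, substituting each parameter whose
// call-site operand is a known constant. Unknown operands stay symbolic.
ExprPtr inlineBody(const ExprPtr &body, const Operands &ops) {
  if (body->kind == Expr::Const) return constant(body->value);
  if (body->kind == Expr::Arg) {
    assert(body->value >= 0 &&
           static_cast<std::size_t>(body->value) < ops.size());
    const auto &op = ops[static_cast<std::size_t>(body->value)];
    return op ? constant(*op) : arg(body->value);
  }
  ExprPtr l = inlineBody(body->lhs, ops);
  ExprPtr r = inlineBody(body->rhs, ops);
  return body->kind == Expr::Add ? add(l, r) : mul(l, r);
}

// Step 2, "fold": collapse operations whose operands are both constants.
ExprPtr fold(const ExprPtr &e) {
  if (e->kind == Expr::Const || e->kind == Expr::Arg) return e;
  ExprPtr l = fold(e->lhs);
  ExprPtr r = fold(e->rhs);
  if (l->kind == Expr::Const && r->kind == Expr::Const)
    return constant(e->kind == Expr::Add ? l->value + r->value
                                         : l->value * r->value);
  return e->kind == Expr::Add ? add(l, r) : mul(l, r);
}

// Step 3: keep the result only if it folded to a constant. Otherwise the
// inlined copy is simply dropped and the original body is left untouched.
std::optional<std::int64_t> inlineTrial(const ExprPtr &body,
                                        const Operands &ops) {
  ExprPtr folded = fold(inlineBody(body, ops));
  if (folded->kind == Expr::Const) return folded->value;
  return std::nullopt;
}

// objectsize-style wrapper: the constant if the trial succeeded, else the
// "unknown" value (0 when min is true, -1 otherwise).
std::int64_t objectSize(const ExprPtr &body, const Operands &ops, bool min) {
  std::optional<std::int64_t> r = inlineTrial(body, ops);
  return r ? *r : (min ? 0 : -1);
}
```

With a calloc-like body mul(arg(0), arg(1)), the trial succeeds only when
both operands are constants at the call site; otherwise the copy is thrown
away and the caller gets the 0/-1 fallback.  The cost concern is visible
even here: every query copies and folds the whole body.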


Today, after playing around with these things, I found another problem:
inlining functions with this alloc metadata.  Assuming that we attach the
metadata to call sites in the front-end, if the function later gets inlined,
then the metadata is lost, since the call instruction that carried it is
gone.  We can, however, allow the metadata to be attached to arbitrary
instructions, so that the inliner can be taught to attach it to the
returned expression.
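
A minimal sketch of that idea, again as a toy C++ model rather than LLVM's
inliner: Inst and inlineCall are hypothetical, and the "alloc_size" key and
its value are only placeholders for whatever form the metadata ends up
taking.

```cpp
#include <map>
#include <string>

// Toy instruction: some text plus a bag of named metadata.
// Hypothetical model only; not LLVM's Instruction class.
struct Inst {
  std::string text;
  std::map<std::string, std::string> metadata;
};

// Replace a call with the callee's returned expression.
// propagate == false models the problem: the call's metadata dies with it.
// propagate == true models the proposed fix: the inliner copies the call's
// metadata onto the returned expression.
Inst inlineCall(const Inst &call, const Inst &returnedExpr, bool propagate) {
  Inst result = returnedExpr;
  if (propagate) {
    for (const auto &kv : call.metadata)
      result.metadata.insert(kv);  // keeps entries already on returnedExpr
  }
  return result;
}
```

Without propagation the result has no "alloc_size" entry at all; with it,
the entry survives on the instruction that replaces the call.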

Nuno 



