[PATCH] [InstCombine] Lower unknown @llvm.objectsize early.

Sun Feb 1 11:14:01 PST 2015

Sorry for the late reply. Comments inline.

>>> We already try to lower objectsize in InstCombine, but only if the
>>> result is known.  When it is unknown, the intrinsic calls would survive
>>> the mid-level optimizers, to be lowered late, in CodeGenPrepare.
>>>
>>> We can lower them in InstCombine as well, to 0 or -1, depending on the
>>> min argument.
>>> One could argue this is a bit early to do the lowering, since the size
>>> might be made apparent by later optimizations.
>>
>>
>> Yep.
>>
>>> In practice however,
>>> the majority of cases has a never-known objectsize, and in a lot of
>>> the remaining few, the size is immediately apparent (say from a
>>> global value, or an alloca).
>>
>>
>> Sure I'd expect that, but that isn't really a problem on its own.
>>
>>> Always keeping the intrinsic call intact prevents optimizations, and
>>> makes memcpyopt useless when libcall fortification is enabled (the
>>> default on a few major targets).
>>
>>
>> Can we fix that instead? Not waiting as long as possible to lower unknown
>> llvm.objectsize just can't be right.

Agreed. We should leave objectsize intrinsics in place until the very last
minute.  For example, inlining may expose the allocation site, so a size
that could not be computed before inlining can be computed afterwards.
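
To make the inlining point concrete, here is a toy model in C++ (not LLVM
code; KnownSize, lowerObjectSize, beforeInlining and afterInlining are
names made up for this sketch).  It assumes only the documented fallback of
llvm.objectsize: 0 when min is true, -1 when min is false.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LLVM's API: what the size analysis can prove about a pointer.
struct KnownSize {
    bool known;           // true if the allocation site is visible
    std::uint64_t bytes;  // meaningful only when known is true
};

// Fallback documented for llvm.objectsize when the size is unknown:
// 0 if `min` is true, -1 (all bits set) if `min` is false.
std::uint64_t lowerObjectSize(KnownSize size, bool min) {
    if (size.known)
        return size.bytes;
    return min ? 0 : ~std::uint64_t{0};
}

// Hypothetical callee taking `char *p`: before inlining, p is an opaque argument.
KnownSize beforeInlining() { return KnownSize{false, 0}; }

// Once inlined into a caller that declares `char buf[16]`, the alloca is visible.
KnownSize afterInlining() { return KnownSize{true, 16}; }

// Usage: lowering before inlining bakes in the pessimistic answer.
void demoInlining() {
    assert(lowerObjectSize(beforeInlining(), false) == ~std::uint64_t{0});
    assert(lowerObjectSize(afterInlining(), false) == 16);
}
```

If the intrinsic is lowered at the first point, the -1 is final and the
fortified call can never be checked against the 16 bytes that become
visible later.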

> I think the right way to fix that would be to also turn fortified
> libcalls into intrinsics: if we want to start treating them like their
> non-checking counterparts, we should just represent them the same way.
> The objectsize, whether known or not, should really be something we
> can drop when it's convenient, much like arithmetic flags.
> If it sounds reasonable, I'll tinker with that.
>
> For now, I'm curious: what do you think of the other alternatives
> below?  Say, lowering objectsize right before memcpyopt?

So I would say that after inlining and the usual code cleanups (and maybe
LICM?), no optimization will expose more opportunities for the objectsize
analysis to kick in.  I would look for a good place to introduce a new
pass whose only job is to lower objectsize.

The second question is how much we actually care about these fortified
functions.  If we care a lot, the current memory-handling intrinsics could
be extended with yet another bool parameter that, when true, means the
write pointer should be checked.  But then others may also want to check
the read pointer, or to enable run-time code emission (it's possible to
lower objectsize to a bunch of instructions that compute the size at
run-time).  The number of combinations gets high.  The current users
(e.g., MacOS) probably don't care about all of these, though.
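
As a rough illustration of how the flags multiply, here is a plain C++
sketch (not a proposed intrinsic signature; CopyChecks and checkedCopy are
hypothetical names).  Real fortified libcalls abort on failure; this
sketch returns false instead so that it is easy to exercise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical per-call options; every extra flag doubles the variants.
struct CopyChecks {
    bool checkDst;  // check the write pointer against its object size
    bool checkSrc;  // check the read pointer against its object size
};

// Copies n bytes unless a requested check fails; returns whether it copied.
bool checkedCopy(void* dst, std::size_t dstSize,
                 const void* src, std::size_t srcSize,
                 std::size_t n, CopyChecks checks) {
    if (checks.checkDst && n > dstSize)
        return false;  // would write past the destination object
    if (checks.checkSrc && n > srcSize)
        return false;  // would read past the source object
    std::memcpy(dst, src, n);
    return true;
}

// Usage: an 8-byte copy into a 4-byte buffer is rejected, a 4-byte one is not.
void demoCheckedCopy() {
    char dst[4];
    const char src[8] = "abcdefg";
    assert(!checkedCopy(dst, sizeof dst, src, sizeof src, 8, CopyChecks{true, false}));
    assert(checkedCopy(dst, sizeof dst, src, sizeof src, 4, CopyChecks{true, true}));
}
```

Two flags already give four variants per memory function, and a third one
for run-time size computation would make it eight.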

>>> Also, Nuno proposes a related improvement to the associated
>>> APIs (MemoryBuiltins.h): "[the analyzer] could be improved to
>>> produce an interval instead.  If we know that the minimum
>>> objectsize is larger than the written size, then the check could
>>> go away, even if we don't know the exact size of the buffer."

Well, that's orthogonal to the issue you raised.  I believe computing a
range for object sizes would improve things significantly, though, and the
amount of work required is probably small if we reuse the LVI
(LazyValueInfo) analysis.
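
A minimal sketch of the interval idea, in plain C++ rather than against
MemoryBuiltins.h (SizeRange, CheckResult and classifyWrite are hypothetical
names, and the bounds are assumed to be given): a check can be decided
statically whenever the written size falls outside the range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical analysis result: inclusive bounds on the object size in bytes.
struct SizeRange {
    std::uint64_t min;
    std::uint64_t max;
};

enum class CheckResult { AlwaysSafe, AlwaysOverflows, NeedsRuntimeCheck };

// Decide what to do with the fortification check for a write of writeSize bytes.
CheckResult classifyWrite(SizeRange object, std::uint64_t writeSize) {
    if (writeSize <= object.min)
        return CheckResult::AlwaysSafe;       // fits even in the smallest object
    if (writeSize > object.max)
        return CheckResult::AlwaysOverflows;  // too big even for the largest
    return CheckResult::NeedsRuntimeCheck;    // depends on the actual size
}

// Usage: a buffer known to be 32 to 128 bytes, exact size unknown.
void demoRange() {
    SizeRange buf{32, 128};
    assert(classifyWrite(buf, 16) == CheckResult::AlwaysSafe);
    assert(classifyWrite(buf, 64) == CheckResult::NeedsRuntimeCheck);
}
```

The first case is the one from the quote above: the check goes away even
though the exact size of the buffer is not known.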

Nuno