[LLVMdev] RFC: implicit null checks in llvm
Sanjoy Das
sanjoy at playingwithpointers.com
Tue Jun 2 14:42:02 PDT 2015
I decided to go with Andy's suggestion of lowering explicit null
checks into implicit null checks late, after register allocation. The
tip of the change is at http://reviews.llvm.org/D10201.
-- Sanjoy
On Wed, Apr 29, 2015 at 6:52 PM, Andrew Trick <atrick at apple.com> wrote:
>
>> On Apr 24, 2015, at 4:14 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> I don't think we can expose the memory operations directly from a
>> semantic, theoretical point of view. Whether practically we can do
>> this or not is a different question.
>>
>> Does LLVM do optimizations like these at the machine instruction
>> level?
>>
>>
>> if (condition)
>>   T = *X // normal load, condition guards against null
>>
>> EH_LABEL // clobbers all
>> U = *X // implicit null check, branches out on fault
>> EH_LABEL // clobbers all
>> ...
>>
>> =>
>>
>> since the second "load" from X always happens, X must be
>> dereferenceable
>>
>>
>> T = *X // miscompile here
>>
>> EH_LABEL // clobbers all
>> U = *X // implicit null check, branches out on fault
>> EH_LABEL // clobbers all
>> ...
>>
>> The fundamental problem, of course, is that we're hiding the real
>> control flow which is
>>
>> if (!is_dereferenceable(X)) branch_out;
>> U = *X
>
> That’s a good description of the problem.
>
> Lowering to real loads will *probably* just work because you are being saved by EH_LABEL instructions, which are conservatively modeled as having unknown side effects. The feature that saves you will also defeat optimization of those loads. I don't see any advantage of this in terms of optimizing codegen. It is just a workaround to avoid defining pseudo instructions.
>
> The optimal implementation would be to leave the explicit null check in place. Late in the pipeline, just before post-ra scheduling, a pass would combine and+cmp+br+load when it is profitable using target hooks like getLdStBaseRegImmOfsWidth(). Note that we still have alias information in the form of machine mem operands.
>
> You could take a step in that direction without doing much backend work by lowering to pseudo-loads during ISEL instead of using EH_LABEL. Then the various load/store optimizations could be taught to explicitly optimize normal loads and stores over the pseudo loads but not among them.
>
> Andy
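
For readers following the thread, the late-lowering idea Andy describes can be sketched as a toy peephole. This is only an illustration under stated assumptions, not the code in D10201 and not LLVM's API: the `Inst` and `Op` types, the `foldNullChecks` function, and the 4096-byte `kNullPageSize` are all hypothetical stand-ins. The explicit check stays in the instruction stream until late, and is folded into a faulting load only when the load directly follows the check, uses the checked register as its base, and has a small enough immediate offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-ins for machine instructions. These are NOT LLVM's MachineInstr
// types; they exist only to illustrate the folding condition.
enum class Op { BranchIfNull, Load, FaultingLoad, Other };

struct Inst {
  Op op;
  int reg;        // register tested (BranchIfNull) or base register (loads)
  int64_t offset; // byte offset from the base for loads; unused otherwise
};

// Assumption: the first 4096 bytes of the address space are unmapped, so a
// load from [null + offset] with a smaller offset is guaranteed to fault.
constexpr int64_t kNullPageSize = 4096;

// Fold "branch out if reg is null; load [reg + offset]" into one faulting load.
// The explicit check is kept whenever any of the conditions does not hold.
std::vector<Inst> foldNullChecks(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    const Inst &cur = in[i];
    if (cur.op == Op::BranchIfNull && i + 1 < in.size()) {
      const Inst &next = in[i + 1];
      bool foldable = next.op == Op::Load && next.reg == cur.reg &&
                      next.offset >= 0 && next.offset < kNullPageSize;
      if (foldable) {
        // The load itself now acts as the null check.
        out.push_back({Op::FaultingLoad, next.reg, next.offset});
        ++i; // the load has been consumed along with the check
        continue;
      }
    }
    out.push_back(cur);
  }
  return out;
}
```

In this model, a check on register 1 followed by a load from [reg1 + 8] becomes a single faulting load, while a load at offset 8192 or from a different base register keeps its explicit check. A real pass would also need to reason about control flow, intervening instructions, and profitability, which this sketch ignores.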