[LLVMdev] why we assume malloc() always returns a non-null pointer in instruction combining?

Wed Apr 1 01:27:00 PDT 2015

Hi Jiangning,

Sorry, I don't buy that argument. I don't see why the compiler statically
emulating the behaviour of a well-behaved malloc/free pair is any different
from it inlining a version of strcmp() (the library may have a strcmp that
just returns -1 - the standard says it's allowed to), or doing constant
propagation with well-known library calls such as fabs().
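
As a rough illustration of that kind of folding (the function names below are
made up for this example, not taken from LLVM or any library), a hosted
compiler is free to evaluate both of these calls at compile time, because it
assumes strcmp() and fabs() behave as the C library specifies:

```c
#include <math.h>
#include <string.h>

/* Hypothetical example: both arguments are known literals, so a compiler
   that models strcmp() may reduce the whole body to "return 1". */
int same_literal(void) {
  return strcmp("llvm", "llvm") == 0;
}

/* Hypothetical example: fabs() of a constant can be constant-propagated,
   leaving just the value 2.5 with no call at run time. */
double folded_abs(void) {
  return fabs(-2.5);
}
```

Whether the calls are actually folded depends on the compiler and the
optimization level; with -ffreestanding it should not assume anything about
either function.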

Without -ffreestanding, the compiler *knows* it is sitting on top of a C
library, and it knows roughly how a C library behaves. Granted, malloc() is
one of the few C library functions with side effects that the compiler can
do something with, but removing it completely is certainly a good thing.

Consider:

int *my_useless_buffer = malloc(LOTS);
my_useless_buffer[0] = 0;
for (size_t i = 0; i < N; i++) {  /* X is an array of N ints */
  my_useless_buffer[0] += X[i];
}
free(my_useless_buffer);

The compiler would be expected to reduce my_useless_buffer to a single int
and remove the malloc. I agree with David that -ffreestanding is the way to
inform the compiler that it shouldn't make any assumptions about
malloc/free/strcmp/memcpy/memset...
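
A self-contained sketch of the same idea, in case it helps. The function name
and parameters are hypothetical, and the buffer is shrunk to a single int for
brevity. Because the buffer never escapes the function, a compiler that treats
malloc/free as the C library's versions may keep the running total in a
register and drop both calls, including the null check:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sums n ints through a heap-allocated scratch slot. */
int sum_via_useless_buffer(const int *values, size_t n) {
  int *my_useless_buffer = malloc(sizeof *my_useless_buffer);
  if (my_useless_buffer == NULL) {
    return 0; /* this branch may vanish together with the malloc */
  }
  my_useless_buffer[0] = 0;
  for (size_t i = 0; i < n; i++) {
    my_useless_buffer[0] += values[i];
  }
  int total = my_useless_buffer[0];
  free(my_useless_buffer);
  return total;
}
```

The observable result is the same whether or not the allocation is removed,
which is why the transformation is considered valid in a hosted environment.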

Cheers,

James

On Wed, 1 Apr 2015 at 09:20 David Majnemer <david.majnemer at gmail.com> wrote:

> On Wed, Apr 1, 2015 at 12:15 AM, Kevin Qin <kevinqindev at gmail.com> wrote:
>
>> Hi David and Mats,
>>
>> Thanks for your explanation. If my understanding is correct, we don't need
>> to consider the side effects of malloc/free unless compiling with
>> -ffreestanding, because without -ffreestanding a user-defined malloc/free
>> must be compatible with the standard library. That makes sense to me.
>>
>> My point is that, in the standard library, malloc is allowed to return null
>> if the allocation fails. How can the compiler know at compile time that it
>> must succeed? I slightly modified the regression case:
>>
>> define i1 @CanWeMallocWithSize(i32 %a) {
>> ; CHECK-LABEL: @CanWeMallocWithSize(
>> ; CHECK-NEXT: ret i1 false
>>   %m = call i8* @malloc(i32 %a)
>>   %z = icmp eq i8* %m, null
>>   call void @free(i8* %m)
>>   ret i1 %z
>> }
>>
>> This function might be used to detect whether the runtime environment can
>> malloc a block of memory of size a. It could also make the memory
>> allocator request a large block from the system up front, reducing the
>> number of system calls made by the many small mallocs that follow. In
>> extreme situations the check may fail, and the program can then show a
>> decent error message and stop. So the problem is that this is not simply a
>> malloc immediately followed by a free: the pointer returned by malloc is
>> compared with null and ultimately affects the return value. This
>> optimization may therefore change the original semantics.
>>
>
> A program cannot rely on a prior call to a pair of malloc and free to
> suggest that a subsequent call to malloc might succeed.  In fact, a valid
> implementation of a debug malloc might unconditionally report that the nth
> call to malloc will fail in order to help find bugs in a program.
>
>
>>
>>
>> Thanks,
>> Kevin
>>
>
>> 2015-04-01 12:52 GMT+08:00 David Majnemer <david.majnemer at gmail.com>:
>>
>>>
>>>
>>> On Tue, Mar 31, 2015 at 7:59 PM, Jiangning Liu <liujiangning1 at gmail.com>
>>> wrote:
>>>
>>>> Hi Mats,
>>>>
>>>> I think Kevin's point is that malloc can return 0; if the malloc/free pair
>>>> is optimized away, the semantics of the original program would change.
>>>>
>>>> On the other hand, malloc/free are special functions, but programmers can
>>>> still define their own versions by not linking the standard library, so we
>>>> must assume malloc/free always have side effects like other ordinary
>>>> functions, unless we know at link time that only the standard library will
>>>> be linked.
>>>>
>>>
>>> If programmers want to do this, they need to compile their program with
>>> -ffreestanding.
>>>
>>>
>>>>
>>>> Thanks,
>>>> -Jiangning
>>>>
>>>>
>>>> 2015-03-31 17:51 GMT+08:00 Kevin Qin <kevinqindev at gmail.com>:
>>>>
>>>>> Yes, I classified `new (std::nothrow)` as a malloc-like allocation.
>>>>> See the next sentence.
>>>>>
>>>>>
>>>>> 2015-03-31 17:48 GMT+08:00 mats petersson <mats at planetcatfish.com>:
>>>>>
>>>>>> > I think we can do such optimization with operator new, because new
>>>>>> > never returns null.
>>>>>>
>>>>>> This is incorrect in the case of `new (std::nothrow) ...` - the whole
>>>>>> point of `(std::nothrow)` is to tell new that it should return NULL in
>>>>>> case of failure, rather than throw an exception (bad_alloc).
>>>>>>
>>>>>> But the point here is not the actual return value, but the fact that
>>>>>> the compiler misses that the constructor has side-effects.
>>>>>>
>>>>>> --
>>>>>> Mats
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 31 March 2015 at 10:44, mats petersson <mats at planetcatfish.com>
>>>>>> wrote:
>>>>>> > The optimisation here is that "nothing uses `m`, so we can assume
>>>>>> > allocation works and remove the malloc + free pair".
>>>>>> >
>>>>>> > What is the purpose of allocating 1 (or 100, or 1000000000) bytes,
>>>>>> > never use it, and then free it immediately?
>>>>>> >
>>>>>> > The test-code in the bug report does rely on the constructor being
>>>>>> > called, and the bug here is probably [as I'm not familiar with the
>>>>>> > workings of the compiler in enough detail] that it doesn't recognize
>>>>>> > that the constructor has side-effects.
>>>>>> >
>>>>>> > --
>>>>>> > Mats
>>>>>> >
>>>>>> > On 31 March 2015 at 10:24, Kevin Qin <kevinqindev at gmail.com> wrote:
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >>
>>>>>> >> When looking into the bug at
>>>>>> >> https://llvm.org/bugs/show_bug.cgi?id=21421, I found a regression
>>>>>> >> test in Transforms/InstCombine/malloc-free-delete.ll that prevents
>>>>>> >> me from fixing it directly. The test is:
>>>>>> >>
>>>>>> >> define i1 @foo() {
>>>>>> >> ; CHECK-LABEL: @foo(
>>>>>> >> ; CHECK-NEXT: ret i1 false
>>>>>> >>   %m = call i8* @malloc(i32 1)
>>>>>> >>   %z = icmp eq i8* %m, null
>>>>>> >>   call void @free(i8* %m)
>>>>>> >>   ret i1 %z
>>>>>> >> }
>>>>>> >>
>>>>>> >> According to http://www.cplusplus.com/reference/cstdlib/malloc/,
>>>>>> >> malloc may return null if the memory allocation fails. So why do we
>>>>>> >> assume malloc() always returns a non-null pointer here?
>>>>>> >>
>>>>>> >> I think we can do this optimization with operator new, because new
>>>>>> >> never returns null. But for all malloc-like allocations (malloc,
>>>>>> >> calloc, and new with std::nothrow), we shouldn't do this.
>>>>>> >>
>>>>>> >> That regression test has existed for a long time, and I'm not sure
>>>>>> >> whether there's any special reason for it. Does anybody know about
>>>>>> >> this?
>>>>>> >>
>>>>>> >> --
>>>>>> >> Thanks,
>>>>>> >>
>>>>>> >> Kevin Qin
>>>>>> >>
>>>>>> >> _______________________________________________
>>>>>> >> LLVM Developers mailing list
>>>>>> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>> >>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>> Kevin Qin
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Best Regards,
>>
>> Kevin Qin
>>
>