[llvm-dev] Clearing the BSS section
devh8h via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 28 08:47:57 PDT 2015
Without the -disable-simplify-libcalls option, opt generates a call to the memset intrinsic, and llc in turn generates a call to __eabi_memset, which remains unresolved. That is why I think compiler-rt is needed here.
I have changed the global declarations to:
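For reference, the memset-based approach suggested in the quoted messages below (convert both addresses to integers, subtract them to get the length, then clear that many bytes) can be sketched in C. This is only an illustration under assumptions: `clear_range` is a hypothetical helper name, and the bounds are passed as parameters instead of being read from the linker symbols. On a bare-metal target the `memset` call still has to be resolved by some runtime library, which is the problem described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero the bytes in [start, end).
   The length is computed on integers, so start == end yields a
   length of 0 and memset writes nothing. */
void clear_range(void *start, void *end)
{
    uintptr_t first = (uintptr_t)start;
    uintptr_t last = (uintptr_t)end;

    assert(last >= first); /* precondition: a well-formed range */
    memset(start, 0, (size_t)(last - first));
}
```

With an empty range (both bounds equal) the call is a no-op, which is the behaviour wanted for an empty BSS section.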
;———
@__bss_start = external global i32
@__bss_end = weak global i32 0
;———
But nothing changed: the optimized code is still a repeat-until loop.
I have also tried, without success:
;———
@__bss_start = extern_weak global i32
@__bss_end = extern_weak global i32
;———
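What the unoptimized function is meant to do, testing before each store so that an empty range writes nothing, looks roughly like this in C. It is a sketch only: `clear_words` is a hypothetical name, and taking the bounds as parameters is an assumption made for illustration. It does not by itself change how the optimizer treats the two linker symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store 0 to every 32-bit word in [start, end).
   The test comes before the store (a while loop), so start == end
   returns without writing anything. */
void clear_words(uint32_t *start, uint32_t *end)
{
    uint32_t *p = start;

    assert(start <= end); /* precondition: end is reachable from start */
    while (p != end) {
        *p = 0;
        p++;
    }
}
```

The repeat-until form produced by the optimizer differs only in doing the store before the first comparison, which is what breaks the empty case.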
Regards,
Pierre Molinaro
> On 28 Aug 2015, at 17:34, John Criswell <jtcriswel at gmail.com> wrote:
>
> On 8/28/15 11:31 AM, devh8h wrote:
>> I had thought of using the memset intrinsic; unfortunately, I did not succeed in cross-compiling compiler-rt on my Mac.
>
> I don't think you need compiler-rt to use the memset intrinsic. I think the code generator will generate efficient inline code for it (though I'm not certain).
>
> In any event, Joerg's suggestion of making one external weak sounds a lot easier.
> :)
>
> Regards,
>
> John Criswell
>
>>
>> Regards,
>>
>> Pierre Molinaro
>>
>>
>>
>>> On 28 Aug 2015, at 17:22, John Criswell <jtcriswel at gmail.com> wrote:
>>>
>>> On 8/28/15 11:20 AM, devh8h wrote:
>>>> It is a very basic "blink-led" program on a Teensy 3.1. There is no operating system. The BSS clear function is called at boot. I would like to write a general BSS clear function that behaves correctly even if the BSS section is empty.
>>>
>>> I thought it was something like that.
>>>
>>> Let me know if the memset intrinsic approach works.
>>>
>>> Regards,
>>>
>>> John Criswell
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Pierre Molinaro
>>>>
>>>>> On 28 Aug 2015, at 17:00, John Criswell <jtcriswel at gmail.com> wrote:
>>>>>
>>>>> On 8/28/15 10:52 AM, devh8h via llvm-dev wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am writing a function that clears the BSS section on a Cortex-M4 embedded system.
>>>>> I assume that, for some reason, the operating system is not demand-paging in zeroed memory. Is that correct?
>>>>>
>>>>>> The LLVM (version 3.7.0rc3) code I wrote is:
>>>>>> ;------------
>>>>>> target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
>>>>>> target triple = "thumbv7em-none--eabi"
>>>>>>
>>>>>> @__bss_start = external global i32
>>>>>> @__bss_end = external global i32
>>>>>>
>>>>>> define void @clearBSS () nounwind {
>>>>>> entry:
>>>>>> br label %bssLoopTest
>>>>>> bssLoopTest:
>>>>>> %p = phi i32* [@__bss_start, %entry], [%p.next, %bssLoop]
>>>>>> %completed = icmp eq i32* %p, @__bss_end
>>>>>> br i1 %completed, label %clearCompleted, label %bssLoop
>>>>>> bssLoop:
>>>>>> store i32 0, i32* %p, align 4
>>>>>> %p.next = getelementptr inbounds i32, i32* %p, i32 1
>>>>>> br label %bssLoopTest
>>>>>> clearCompleted:
>>>>>> ret void
>>>>>> }
>>>>>> ;------------
>>>>>>
>>>>>> This code runs. But when I optimize it with:
>>>>>> opt -disable-simplify-libcalls -Os -S source.ll -o optimized.ll
>>>>>>
>>>>>> I get the following code for the @clearBSS function:
>>>>>> ;------------
>>>>>> define void @clearBSS() nounwind {
>>>>>> entry:
>>>>>> br label %bssLoop
>>>>>>
>>>>>> bssLoop: ; preds = %entry, %bssLoop
>>>>>> %p1 = phi i32* [ @__bss_start, %entry ], [ %p.next, %bssLoop ]
>>>>>> store i32 0, i32* %p1, align 4
>>>>>> %p.next = getelementptr inbounds i32, i32* %p1, i32 1
>>>>>> %completed = icmp eq i32* %p.next, @__bss_end
>>>>>> br i1 %completed, label %clearCompleted, label %bssLoop
>>>>>>
>>>>>> clearCompleted: ; preds = %bssLoop
>>>>>> ret void
>>>>>> }
>>>>>> ;------------
>>>>>> The optimizer has transformed the while loop into a repeat-until loop.
>>>>>>
>>>>>> I think it assumes that the two variables @__bss_start and @__bss_end are distinct. But they are resolved at link time, and they are the same if the BSS section is empty: in that case, the optimized function fails.
>>>>>>
>>>>>> Is there a way to prevent the optimizer from assuming that the two variables are distinct? Or what is the proper way to deal with link-time values?
>>>>> Have you tried using the memset intrinsic? You could cast bss_start and bss_end to integers, subtract them to find the length, and then use memset to zero the memory. I would think memset should work if the length is zero.
>>>>>
>>>>> Regards,
>>>>>
>>>>> John Criswell
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Pierre Molinaro
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>> --
>>>>> John Criswell
>>>>> Assistant Professor
>>>>> Department of Computer Science, University of Rochester
>>>>> http://www.cs.rochester.edu/u/criswell
>>>>>
>>>
>>>
>>>
>>
>
>