[llvm-dev] Clearing the BSS section

devh8h via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 28 08:47:57 PDT 2015


Without the -disable-simplify-libcalls option, opt generates a call to the memset intrinsic, and llc in turn generates a call to __aeabi_memset, which remains unresolved at link time. That is why I think compiler-rt is needed here.
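
If cross-compiling compiler-rt is the obstacle, one possible alternative is to supply the missing helper directly in the startup code. The sketch below is only an illustration and has not been tried on the Teensy. It assumes the signature given in the ARM run-time ABI, where the length comes before the fill value, and it uses a volatile pointer so that the optimizer does not turn the loop back into a memset call. The backend may reference a different helper variant, so the name to define should be taken from the linker's error message.

```c
#include <stddef.h>

/* Sketch of a freestanding replacement for the EABI helper.
 * Assumed signature (ARM run-time ABI): length comes BEFORE the fill value. */
void __aeabi_memset(void *dest, size_t n, int c)
{
    /* volatile: keeps the compiler from recognizing this loop as a memset
     * and emitting a call back into this very function. */
    volatile unsigned char *p = (volatile unsigned char *)dest;

    for (size_t i = 0; i < n; ++i) {
        p[i] = (unsigned char)c;   /* n == 0 stores nothing */
    }
}
```

With n equal to zero the loop body never runs, which is the property needed when the BSS section is empty.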

I have changed the global declarations to:
;———
@__bss_start = external global i32
@__bss_end   = weak global i32 0
;———

But nothing changes: the optimized code is still a repeat-until loop.

I have also tried, without success:
;———
@__bss_start = extern_weak global i32
@__bss_end   = extern_weak global i32
;———
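
For reference, the behaviour wanted from clearBSS, including the empty case, can be modelled in C as below. This is a sketch for illustration only: the name clear_words and the choice to pass the bounds as parameters are assumptions, whereas in the startup code the bounds would be the addresses of @__bss_start and @__bss_end. Whether this formulation survives inlining without the loop being rotated has not been verified.

```c
#include <stdint.h>

/* Hypothetical helper: zero every 32-bit word in the range [start, end). */
void clear_words(uint32_t *start, uint32_t *end)
{
    uint32_t *p = start;

    while (p != end) {   /* test first: an empty range performs no store */
        *p = 0;
        ++p;
    }
}
```

When start equals end the function returns without writing, which is the case the repeat-until form gets wrong.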

Regards,

Pierre Molinaro

> On 28 Aug 2015, at 17:34, John Criswell <jtcriswel at gmail.com> wrote:
> 
> On 8/28/15 11:31 AM, devh8h wrote:
>> I had thought of using the memset intrinsic; unfortunately, I did not succeed in cross-compiling compiler-rt on my Mac.
> 
> I don't think you need compiler-rt to use the memset intrinsic.  I think the code generator will generate efficient inline code for it (though I'm not certain).
> 
> In any event, Joerg's suggestion of making one external weak sounds a lot easier.
> :)
> 
> Regards,
> 
> John Criswell
> 
>> 
>> Regards,
>> 
>> Pierre Molinaro
>> 
>> 
>> 
>>> On 28 Aug 2015, at 17:22, John Criswell <jtcriswel at gmail.com> wrote:
>>> 
>>> On 8/28/15 11:20 AM, devh8h wrote:
>>>> It is a very basic "blink-led" program on a Teensy 3.1. There is no operating system. The BSS clear function is called at boot. I would like to write a general BSS clear function that behaves correctly even if the BSS section is empty.
>>> 
>>> I thought it was something like that.
>>> 
>>> Let me know if the memset intrinsic approach works.
>>> 
>>> Regards,
>>> 
>>> John Criswell
>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Pierre Molinaro
>>>> 
>>>>> On 28 Aug 2015, at 17:00, John Criswell <jtcriswel at gmail.com> wrote:
>>>>> 
>>>>> On 8/28/15 10:52 AM, devh8h via llvm-dev wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I am writing a function that clears the BSS section on a Cortex-M4 embedded system.
>>>>> I assume that, for some reason, the operating system is not demand-paging in zeroed memory.  Is that correct?
>>>>> 
>>>>>> The LLVM (version 3.7.0rc3) code I wrote is:
>>>>>> ;------------
>>>>>> target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
>>>>>> target triple = "thumbv7em-none--eabi"
>>>>>> 
>>>>>> @__bss_start = external global i32
>>>>>> @__bss_end   = external global i32
>>>>>> 
>>>>>> define void @clearBSS () nounwind {
>>>>>> entry:
>>>>>>   br label %bssLoopTest
>>>>>>  bssLoopTest:
>>>>>>   %p = phi i32* [@__bss_start, %entry], [%p.next, %bssLoop]
>>>>>>   %completed = icmp eq i32* %p, @__bss_end
>>>>>>   br i1 %completed, label %clearCompleted, label %bssLoop
>>>>>>  bssLoop:
>>>>>>   store i32 0, i32* %p, align 4
>>>>>>   %p.next = getelementptr inbounds i32, i32* %p, i32 1
>>>>>>   br label %bssLoopTest
>>>>>>  clearCompleted:
>>>>>>   ret void
>>>>>> }
>>>>>> ;------------
>>>>>> 
>>>>>> This code runs. But when I optimize it with:
>>>>>> opt -disable-simplify-libcalls -Os -S source.ll -o optimized.ll
>>>>>> 
>>>>>> I get the following code for the @clearBSS function:
>>>>>> ;------------
>>>>>> define void @clearBSS() nounwind {
>>>>>> entry:
>>>>>>   br label %bssLoop
>>>>>> 
>>>>>> bssLoop:                                          ; preds = %entry, %bssLoop
>>>>>>   %p1 = phi i32* [ @__bss_start, %entry ], [ %p.next, %bssLoop ]
>>>>>>   store i32 0, i32* %p1, align 4
>>>>>>   %p.next = getelementptr inbounds i32, i32* %p1, i32 1
>>>>>>   %completed = icmp eq i32* %p.next, @__bss_end
>>>>>>   br i1 %completed, label %clearCompleted, label %bssLoop
>>>>>> 
>>>>>> clearCompleted:                                   ; preds = %bssLoop
>>>>>>   ret void
>>>>>> }
>>>>>> ;------------
>>>>>> The optimizer has transformed the while loop into a repeat-until loop.
>>>>>> 
>>>>>> I think it assumes that the two variables @__bss_start and @__bss_end are distinct. But they are resolved at link time, and they are the same if the BSS section is empty: in that case, the optimized function fails, because it stores once before testing.
>>>>>> 
>>>>>> Is there a way to prevent the optimizer from assuming that the two variables are distinct? Or what is the proper way to deal with link-time values?
>>>>> Have you tried using the memset intrinsic?  You could cast bss_start and bss_end to integers, subtract them to find the length, and then use memset to zero the memory.  I would think memset should work if the length is zero.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> John Criswell
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Pierre Molinaro
>>>>>> 
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> 
>>>>> -- 
>>>>> John Criswell
>>>>> Assistant Professor
>>>>> Department of Computer Science, University of Rochester
>>>>> http://www.cs.rochester.edu/u/criswell
>>>>> 
>>> 
>> 
> 