[PATCH] Compile time reduction for small programs.

Andrew Trick atrick at apple.com
Wed Feb 5 09:43:43 PST 2014


On Feb 4, 2014, at 11:41 PM, Eric Christopher <echristo at gmail.com> wrote:

> On Tue, Feb 4, 2014 at 10:26 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>> 
>> On Feb 4, 2014, at 10:16 PM, Eric Christopher <echristo at gmail.com> wrote:
>> 
>>> On Tue, Feb 4, 2014 at 9:47 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>>> 
>>>> On Feb 4, 2014, at 6:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>> 
>>>>> Hey cool,
>>>>> 
>>>>> +/// reinitPhysRegEntries - Initializes PhysRegEntries to new
>>>>> +/// zeroed out memory of size NumPhysRegs. This is faster than using
>>>>> +/// llvm::SmallVector::assign(N, 0). It's used in the below code to
>>>>> +/// also supports when a pass manager could be reused for different
>>>>> +/// but similar targets that could have varying numbers of PhysRegs.
>>>>> +void InterferenceCache::reinitPhysRegEntries() {
>>>>> +  if (PhysRegEntriesCount == TRI->getNumRegs()) return;
>>>>> +  free(PhysRegEntries);
>>>>> +  PhysRegEntriesCount = TRI->getNumRegs();
>>>>> +  PhysRegEntries = (unsigned char*) calloc(PhysRegEntriesCount,
>>>>> sizeof(unsigned char));
>>>>> +}
>>>>> 
>>>>> Looks to be faster since we're not actually zeroing out the array if
>>>>> it's the same size as the previous allocation?
>>>>> 
>>>> 
>>>> Not exactly; the compile time improvement gained here over the current implementation comes from the fact that we're using a SmallVector:
>>>> 
>>>> SmallVector<unsigned char, 2> PhysRegEntries;
>>>> 
>>>> that we zero out using assign:
>>>> 
>>>> PhysRegEntries.assign(TRI->getNumRegs(), 0);
>>>> 
>>>> when the SmallVector might not actually be that small at all.
>>>> 
>>>> Compile time was being spent in zeroing out the SmallVector when calling llvm::SmallVector::assign, which uses an iterator to assign each element in the SmallVector to zero.
>>>> I found that just by changing the SmallVector to a std::vector I got a compile time reduction since std::vector ends up using __platform_bzero to do the job.
>>>> 
>>>> Jakob suggested I just calloc the memory in the first place, rather than using a std::vector or calling bzero.
>>>> And actually, it would be nice if SmallVector had some platform specific speed improvements eventually, the way that std::vector does.
>>>> 
>>> 
>>> Yeah, it would be. That's a bit crazy.
>>> 
>>> At any rate, I was wondering why reinit didn't 0 the mem if it
>>> happened to be the same size?
>>> 
>> 
>> Hmm, I am actually not sure what is correct in this case. I'll have to ask Jakob.
>> But the reinit here is more or less a precaution.
>> I believe the existing code would have used the same already populated SmallVector memory if the pass manager we to be reused too.
>> 
> 
> Right, I'm just confused as to the seeming (?) assumption that a
> reused pass manager with a different machine would have the same
> registers in the phys reg map? Or maybe I'm confused :)

If the pass is reused for a machine with a different number of physregs, then the PhysRegEntries array needs to be resized.

If the pass is reused for the same machine (or a machine with same # physregs), it does not need to be resized.

PhysRegEntries does not need to be zeroed between passes, because it is safe for the entries to contain garbage. It works like a SparseSet. The entry is only valid if its corresponding CacheEntries bucket has a matching register assignment.

Puyan, it would be good to add a small comment explaining why the PhysRegEntries do not need to be zero upon initialization.

As for why we do the calloc the first time, it is just good form. And probably prevents valgrind from complaining.

-Andy

> 
> -eric
> 
>>>> 
>>>>> +    for (auto RegI = PhysRegs.begin(), E = PhysRegs.end(); RegI != E; ++RegI)
>>>>> +      if (!MRI->reg_nodbg_empty(*RegI))
>>>>> +        MRI->setPhysRegUsed(*RegI);
>>>>> 
>>>>> Can't use auto quite yet :)
>>>>> 
>>>> 
>>>> Doe! I was under the impression that C++11 support had been enabled several weeks ago.
>>> 
>>> Almost there :)
>>> 
>>> -eric
>>> 
>>>> 
>>>>> 
>>>>> -eric
>>>>> 
>>>> 
>>>> -Thanks
>>>> 
>>>> Puyan
>>>> 
>>>>> On Tue, Feb 4, 2014 at 3:44 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>>>>> 
>>>>>> On Feb 4, 2014, at 2:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>>> 
>>>>>>> On Tue, Feb 4, 2014 at 2:47 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>>>>> On Feb 4, 2014, at 2:41 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>>>>>>> 
>>>>>>>>> Here is a patch for reducing compile time on small programs, especially if the target has a large register file.
>>>>>>>>> I've already done sanity checks to determine that it doesn't regress compile time on large programs.
>>>>>>>> 
>>>>>>>> LGTM. (mainly I'm trusting Jakob's informal review earlier).
>>>>>>> 
>>>>>>> Knowing this helps :)
>>>>>>> 
>>>>>>> It'd be nice to get some comments in the code as to what's going on
>>>>>>> where. There aren't a lot in there as it is, but getting some more
>>>>>>> would be nice.
>>>>>>> 
>>>>>>> -eric
>>>>>>> 
>>>>>> 
>>>>>> Here's a better commented version, Eric:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -Puyan
>>>>>> 
>>>>>> 
>>>>>>>> -Andy
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -Puyan
>>>>>>>>> 
>>>>>>>>> <CompileTimeReductionSmall.patch>_______________________________________________
>>>>>>>>> llvm-commits mailing list
>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> llvm-commits mailing list
>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140205/86adcd76/attachment.html>


More information about the llvm-commits mailing list