[PATCH] Compile time reduction for small programs.

Tue Feb 4 22:16:21 PST 2014

On Tue, Feb 4, 2014 at 9:47 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>
> On Feb 4, 2014, at 6:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>> Hey cool,
>>
>> +/// reinitPhysRegEntries - Initializes PhysRegEntries to new
>> +/// zeroed out memory of size NumPhysRegs. This is faster than using
>> +/// llvm::SmallVector::assign(N, 0). It's used in the below code to
>> +/// also supports when a pass manager could be reused for different
>> +/// but similar targets that could have varying numbers of PhysRegs.
>> +void InterferenceCache::reinitPhysRegEntries() {
>> +  if (PhysRegEntriesCount == TRI->getNumRegs()) return;
>> +  free(PhysRegEntries);
>> +  PhysRegEntriesCount = TRI->getNumRegs();
>> +  PhysRegEntries = (unsigned char*) calloc(PhysRegEntriesCount,
>> sizeof(unsigned char));
>> +}
>>
>> Looks to be faster since we're not actually zeroing out the array if
>> it's the same size as the previous allocation?
>>
>
> Not exactly; the compile time improvement gained here over the current implementation comes from the fact that we're using a SmallVector:
>
> SmallVector<unsigned char, 2> PhysRegEntries;
>
> that we zero out using assign:
>
> PhysRegEntries.assign(TRI->getNumRegs(), 0);
>
> when the SmallVector might not actually be that small at all.
>
> Compile time was being spent in zeroing out the SmallVector when calling llvm::SmallVector::assign, which uses an iterator to assign each element in the SmallVector to zero.
> I found that just by changing the SmallVector to a std::vector I got a compile time reduction since std::vector ends up using __platform_bzero to do the job.
>
> Jakob suggested I just calloc the memory in the first place, rather than using a std::vector or calling bzero.
> And actually, it would be nice if SmallVector had some platform specific speed improvements eventually, the way that std::vector does.
>

Yeah, it would be. That's a bit crazy.

At any rate, I was wondering why reinit didn't 0 the mem if it
happened to be the same size?

>
>> +    for (auto RegI = PhysRegs.begin(), E = PhysRegs.end(); RegI != E; ++RegI)
>> +      if (!MRI->reg_nodbg_empty(*RegI))
>> +        MRI->setPhysRegUsed(*RegI);
>>
>> Can't use auto quite yet :)
>>
>
> Doe! I was under the impression that C++11 support had been enabled several weeks ago.

Almost there :)

-eric

>
>>
>> -eric
>>
>
> -Thanks
>
> Puyan
>
>> On Tue, Feb 4, 2014 at 3:44 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>>
>>> On Feb 4, 2014, at 2:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>
>>>> On Tue, Feb 4, 2014 at 2:47 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>> On Feb 4, 2014, at 2:41 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>>>>
>>>>>> Here is a patch for reducing compile time on small programs, especially if the target has a large register file.
>>>>>> I've already done sanity checks to determine that it doesn't regress compile time on large programs.
>>>>>
>>>>> LGTM. (mainly I'm trusting Jakob's informal review earlier).
>>>>
>>>> Knowing this helps :)
>>>>
>>>> It'd be nice to get some comments in the code as to what's going on
>>>> where. There aren't a lot in there as it is, but getting some more
>>>> would be nice.
>>>>
>>>> -eric
>>>>
>>>
>>> Here's a better commented version, Eric:
>>>
>>>
>>>
>>>
>>> -Puyan
>>>
>>>
>>>>> -Andy
>>>>>
>>>>>>
>>>>>> -Puyan
>>>>>>
>>>>>> <CompileTimeReductionSmall.patch>_______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>>
>