<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><br><pre>At 2013-07-14 23:30:31,"Sebastian Pop" <sebpop@gmail.com> wrote:
>On Sun, Jul 14, 2013 at 10:17 AM, Star Tan <tanmx_star@yeah.net> wrote:
>> Hi Sebastian,
>>
>> Yes, you have pointed out an important reason. If we comment out the source
>> code you listed, the compile-time overhead for oggenc*8.ll can be
>> reduced from 40.5261 ( 51.2%) to 20.3100 ( 35.7%).
>>
>> I just sent another mail to explain why the polly-detect pass leads to
>> significant compile-time overhead. Besides the reason you pointed out,
>> another reason is the string buffer operations in the "INVALID"
>> macro. If we comment out the string buffer operations in both the "INVALID"
>> macro and the "isValidMemoryAccess" function, the compile-time overhead for
>> oggenc*8.ll would be reduced from 40.5261 ( 51.2%) to 5.8813s (15.9%).
>
>Awesome, thanks for the analysis.
>
>Can you run perf again on the resulting program? I would still like to
>understand where we spend the 5.88s in the rest of scop detection.
>
The top 20 functions reported by perf are:
+ 12.68% opt [kernel.kallsyms] [k] 0xc111472c
+ 9.40% opt libc-2.17.so [.] 0x0007875f
+ 2.98% opt opt [.] __x86.get_pc_thunk.bx
+ 2.42% opt [vdso] [.] 0x00000425
+ 1.46% opt libgmp.so.10.0.5 [.] __gmpn_copyi_x86
+ 1.11% opt libc-2.17.so [.] free
+ 1.07% opt opt [.] bool llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::Pass*, llvm::DenseMapInfo<void const*>
+ 1.02% opt opt [.] llvm::ComputeMaskedBits(llvm::Value*, llvm::APInt&, llvm::APInt&, llvm::DataLayout const*, unsigna
+ 1.00% opt opt [.] llvm::Use::getImpliedUser() const
+ 0.90% opt libgmp.so.10.0.5 [.] __gmpz_set
+ 0.76% opt opt [.] llvm::SmallPtrSetImpl::insert_imp(void const*)
+ 0.74% opt opt [.] llvm::InstCombiner::DoOneIteration(llvm::Function&, unsigned int)
+ 0.73% opt opt [.] llvm::PassRegistry::getPassInfo(void const*) const
+ 0.72% opt libc-2.17.so [.] malloc
+ 0.72% opt opt [.] llvm::TimeRecord::getCurrentTime(bool)
+ 0.71% opt opt [.] llvm::ValueHandleBase::AddToUseList()
+ 0.57% opt opt [.] llvm::SlotTracker::processModule()
+ 0.52% opt opt [.] llvm::PMTopLevelManager::findAnalysisPass(void const*)
+ 0.51% opt opt [.] llvm::APInt::~APInt()
+ 0.51% opt libgmp.so.10.0.5 [.] __gmpz_mul

Unfortunately, I cannot set breakpoints for the top two functions.
Even with "perf -g", I still cannot track where the time is spent in the polly-detect pass. The "perf -g" results look like this:
- 12.68% opt [kernel.kallsyms] [k] 0xc111472c
 - 0xc161984a
 - 0xb7783424
 - 99.93% 0x9ab2000
 + 85.99% 0
 + 3.14% 0xb46d018
 + 1.42% 0xb7745000
 - 0xc1612415
 + 97.76% 0xc1062977
 + 1.75% 0xc1495888

Do you have any suggestions?

Best wishes,
Star Tan
</pre></div>