[LLVMdev] Analysis of polly-detect overhead in oggenc
Star Tan
tanmx_star at yeah.net
Sun Jul 14 08:50:27 PDT 2013
At 2013-07-14 23:30:31, "Sebastian Pop" <sebpop at gmail.com> wrote:
>On Sun, Jul 14, 2013 at 10:17 AM, Star Tan <tanmx_star at yeah.net> wrote:
>> Hi Sebastian,
>>
>> Yes, you have pointed out an important reason. If we comment out the source
>> code you listed, the compile-time overhead for oggenc*8.ll is
>> reduced from 40.5261s (51.2%) to 20.3100s (35.7%).
>>
>> I just sent another mail explaining why the polly-detect pass leads to
>> significant compile-time overhead. Besides the reason you pointed out,
>> another reason is the string buffer operations in the "INVALID"
>> macro. If we comment out the string buffer operations in both the "INVALID"
>> macro and the "isValidMemoryAccess" function, the compile-time overhead for
>> oggenc*8.ll is reduced from 40.5261s (51.2%) to 5.8813s (15.9%).
>
>Awesome, thanks for the analysis.
>
>Can you run perf again on the resulting program? I would still like to
>understand where we spend the 5.88s in the rest of scop detection.
>
The top functions reported by perf are:
+ 12.68% opt [kernel.kallsyms] [k] 0xc111472c
+ 9.40% opt libc-2.17.so [.] 0x0007875f
+ 2.98% opt opt [.] __x86.get_pc_thunk.bx
+ 2.42% opt [vdso] [.] 0x00000425
+ 1.46% opt libgmp.so.10.0.5 [.] __gmpn_copyi_x86
+ 1.11% opt libc-2.17.so [.] free
+ 1.07% opt opt [.] bool llvm::DenseMapBase<llvm::DenseMap<void const*, llvm::Pass*, llvm::DenseMapInfo<void const*>
+ 1.02% opt opt [.] llvm::ComputeMaskedBits(llvm::Value*, llvm::APInt&, llvm::APInt&, llvm::DataLayout const*, unsigna
+ 1.00% opt opt [.] llvm::Use::getImpliedUser() const
+ 0.90% opt libgmp.so.10.0.5 [.] __gmpz_set
+ 0.76% opt opt [.] llvm::SmallPtrSetImpl::insert_imp(void const*)
+ 0.74% opt opt [.] llvm::InstCombiner::DoOneIteration(llvm::Function&, unsigned int)
+ 0.73% opt opt [.] llvm::PassRegistry::getPassInfo(void const*) const
+ 0.72% opt libc-2.17.so [.] malloc
+ 0.72% opt opt [.] llvm::TimeRecord::getCurrentTime(bool)
+ 0.71% opt opt [.] llvm::ValueHandleBase::AddToUseList()
+ 0.57% opt opt [.] llvm::SlotTracker::processModule()
+ 0.52% opt opt [.] llvm::PMTopLevelManager::findAnalysisPass(void const*)
+ 0.51% opt opt [.] llvm::APInt::~APInt()
+ 0.51% opt libgmp.so.10.0.5 [.] __gmpz_mul
Unfortunately, I cannot set breakpoints on the top two entries, which perf reports only as raw addresses.
Even with "perf -g", I still cannot tell where the time is spent in the polly-detect pass. The "perf -g" output looks like this:
- 12.68% opt [kernel.kallsyms] [k] 0xc111472c
- 0xc161984a
- 0xb7783424
- 99.93% 0x9ab2000
+ 85.99% 0
+ 3.14% 0xb46d018
+ 1.42% 0xb7745000
- 0xc1612415
+ 97.76% 0xc1062977
+ 1.75% 0xc1495888
Do you have some suggestions?
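
For reference, the string-buffer cost mentioned in the quoted text above can be illustrated with a minimal sketch. This is not Polly's actual "INVALID" macro or "isValidMemoryAccess" code: the names VerboseDiagnostics, describe, rejectEager, rejectGuarded and detect are all hypothetical, and the message format is made up. It only shows the general idea of formatting a rejection message on every failed check versus formatting it only when diagnostics are requested.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// All names below are hypothetical stand-ins, not Polly's real code.
static bool VerboseDiagnostics = false; // assumed switch, e.g. a debug option
static std::string LastFailure;         // last rejection message
static unsigned MessagesBuilt = 0;      // counts how often we paid for formatting

// Build a rejection message through a string stream (the expensive part).
static std::string describe(const char *Reason, unsigned InstId) {
  assert(Reason != nullptr);
  ++MessagesBuilt;
  std::ostringstream OS;
  OS << Reason << " at instruction #" << InstId;
  return OS.str();
}

// Eager variant: formats the message on every rejection.
static bool rejectEager(const char *Reason, unsigned InstId) {
  LastFailure = describe(Reason, InstId);
  return false;
}

// Guarded variant: formats the message only when diagnostics are enabled.
static bool rejectGuarded(const char *Reason, unsigned InstId) {
  if (VerboseDiagnostics)
    LastFailure = describe(Reason, InstId);
  return false;
}

// Simulated detection loop in which every candidate is rejected.
// Returns the number of rejected candidates.
static unsigned detect(unsigned N, bool Guarded) {
  unsigned Rejected = 0;
  for (unsigned I = 0; I < N; ++I) {
    bool Valid = Guarded ? rejectGuarded("non-affine access", I)
                         : rejectEager("non-affine access", I);
    if (!Valid)
      ++Rejected;
  }
  return Rejected;
}
```

In this sketch both variants reject the same candidates, but the guarded one skips the stream formatting entirely unless the (assumed) diagnostics switch is set, which is the kind of saving observed when the string buffer operations were commented out.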
Best wishes,
Star Tan