[cfe-commits] r172900 - /cfe/trunk/bindings/python/clang/cindex.py
Tobias Grosser
tobias at grosser.es
Sat Jan 19 09:13:36 PST 2013
On 01/19/2013 05:07 PM, Sean Silva wrote:
> On Sat, Jan 19, 2013 at 6:03 AM, Tobias Grosser
> <grosser at fim.uni-passau.de> wrote:
>> This is a very performance-critical point for auto-completion. The manual
>> implementation gives a large speedup. As it does not complicate the code a lot,
>> I figured it is worth the change. If anybody understands why CachedProperty is
>> so much slower here, I am very interested in working on an improvement of
>> CachedProperty.
>
> It's possible that it has to do with the fact that the decorator
> causes at least one extra Python function call, which is one of the
> more expensive operations in CPython. The built-in `property` is coded
> in C (in python/Objects/descrobject.c).
Good point. Part of this may be attributed to function call overhead.
However, the speedup I got seemed too large to be explained by function
call overhead alone. I need to investigate this more.
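To make this concrete, here is a minimal, self-contained sketch of the
two approaches (the names are made up; this is not the exact cindex.py
code, and the actual change in r172900 may differ): a pure-Python
caching descriptor versus manual caching behind the C-implemented
built-in `property`.

    class CachedProperty(object):
        """Pure-Python non-data descriptor: the first access pays an
        extra Python-level call (__get__) on top of the wrapped
        function, then shadows itself with a plain instance
        attribute."""
        def __init__(self, wrapped):
            self.wrapped = wrapped

        def __get__(self, instance, instance_type=None):
            if instance is None:
                return self
            value = self.wrapped(instance)
            setattr(instance, self.wrapped.__name__, value)
            return value

    class Completion(object):
        def __init__(self, raw):
            self.raw = raw

        @CachedProperty
        def spelling_cached(self):
            return self.raw.strip()

        # Manual variant: `property` itself is implemented in C, so an
        # access costs one Python call (the getter) plus a cheap
        # attribute check.
        @property
        def spelling_manual(self):
            try:
                return self._spelling
            except AttributeError:
                self._spelling = self.raw.strip()
                return self._spelling

If each attribute is read only once per object, as with a fresh batch of
completion results, the descriptor path pays two Python-level calls
(__get__ plus the wrapped function) where the manual getter pays one.
Constructing fresh objects in the timing loop isolates that first-access
cost:

    import timeit
    for attr in ('spelling_cached', 'spelling_manual'):
        print(timeit.timeit('Completion("  AU  ").%s' % attr,
                            'from __main__ import Completion'))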
> Unfortunately this scenario means that you probably won't be able to
> get equivalent performance from a pure-Python solution (in CPython)
> without doing something very nasty; pypy possibly won't have this
> problem.
Equivalent performance to what? A C implementation of auto-completion?
This is probably right. However, in most cases Python performance is not
the problem: clang is the bottleneck and the Python overhead is not
noticeable. There are still some cases where tuning the Python code helps.
In case you are interested, here are some profiles from my machine:
-----------------------------------------------------
File: lib/Analysis/ScalarEvolution.cpp
Line: 6643, Column: 6
AU.
libclang code completion - Get TU: 0.003s ( 8.3%)
libclang code completion - Code Complete: 0.029s ( 76.2%)
libclang code completion - Count # Results (22): 0.001s ( 2.6%)
libclang code completion - Filter: 0.000s ( 0.0%)
libclang code completion - Sort: 0.000s ( 0.0%)
libclang code completion - Format: 0.002s ( 6.2%)
libclang code completion - Load into vimscript: 0.001s ( 2.1%)
libclang code completion - vimscript + snippets: 0.002s ( 4.6%)
Overall: 0.039 s
-----------------------------------------------------
File: lib/Analysis/ScalarEvolution.cpp
Line: 7023, Column: 12
std::
libclang code completion - Get TU: 0.008s ( 5.9%)
libclang code completion - Code Complete: 0.046s ( 34.0%)
libclang code completion - Count # Results (768): 0.002s ( 1.2%)
libclang code completion - Filter: 0.000s ( 0.0%)
libclang code completion - Sort: 0.000s ( 0.0%)
libclang code completion - Format: 0.045s ( 33.5%)
libclang code completion - Load into vimscript: 0.007s ( 5.2%)
libclang code completion - vimscript + snippets: 0.027s ( 20.1%)
Overall: 0.136 s
-----------------------------------------------------
When we complete some object or class with a low number of results, the
run time is spent almost entirely within clang. Only for something like
std::, where we get almost 800 completions, does formatting the results
take noticeable time (about 30% after my recent changes). I have a
couple of ideas for improving this: further tune the Python code,
implement one or two very hot functions in clang, or only format the
results that are actually shown to the user (see the sketch below).
However, further tuning does not seem critical at the moment. On my
machine all completions already show up without noticeable delay, and
even on slower machines completion is pretty fast. Still, if someone has
ideas for further reducing the Python overhead here, I am sure there are
people who would appreciate it.
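For what it's worth, here is a hypothetical sketch of that last idea
(all names are made up; the real formatter lives in the vim
integration): keep the raw results around and only format the slice the
UI will actually display, so that std:: with its ~800 results pays the
formatting cost for one screenful rather than for everything.

    from collections import namedtuple

    Result = namedtuple('Result', 'spelling')

    def format_result(result):
        # Stand-in for the real, expensive formatter that builds the
        # full vimscript entry (word, abbreviation, snippet, ...).
        return '{"word": "%s"}' % result.spelling

    def visible_completions(results, prefix, max_shown=50):
        # Filter first, then format only what fits on screen.
        matching = [r for r in results if r.spelling.startswith(prefix)]
        return [format_result(r) for r in matching[:max_shown]]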
Tobi