[cfe-commits] r62475 - in /cfe/trunk: Driver/PrintParserCallbacks.cpp include/clang/Parse/Action.h lib/Parse/ParseExpr.cpp lib/Parse/ParseStmt.cpp lib/Sema/Sema.h lib/Sema/SemaChecking.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaOverload.cpp

Howard Hinnant hhinnant at apple.com
Mon Jan 26 18:57:32 PST 2009


On Jan 26, 2009, at 6:55 PM, Douglas Gregor wrote:

>
> On Jan 21, 2009, at 11:52 AM, Daniel Dunbar wrote:
>
>> My performance tester is pointing at this for a 2.5% regression in
>> syntax only time.  Is this inherent and the price we have to pay for
>> cleanup, or unexpected?
>>
>> It would be nice if someone could inspect the code generation for the
>> smart pointers to see if there is any performance we can reclaim.
>
> Now that the low-level optimizations for DISABLE_SMART_POINTERS in
> r63061 are in, I went back to revisit r62475 to see if it's hurting us.
>
> When smart pointers are *enabled*, parsing Cocoa.h (with PTH,
> -disable-free, and -fsyntax-only) is 3% slower due to r62475. So
> smart pointers have some cost when they're enabled (as we'd expect).
>
> When smart pointers are *disabled*, parsing Cocoa.h (with PTH,
> -disable-free, and -fsyntax-only) is 0.2% slower due to r62475. That's
> in the noise on my system, so smart pointers have a negligible cost
> when they're disabled (as we'd hoped).
>
> I've attached a patch that can be used to roll back r62475 for this
> kind of performance testing (if anyone else wants to try to
> duplicate my results), but I have no intention of ever applying it.
> DISABLE_SMART_POINTERS seems to do the job for our purposes.
>
> We've inspected the code generated by smart pointers and raw
> pointers with the smart pointer functionality disabled (i.e.,
> DISABLE_SMART_POINTERS is defined) in a micro-benchmark of
> pointer-passing through the Action interface, and GCC is producing
> relatively good code. In the micro-benchmark, use of the smart
> pointer classes when the smart-pointer functionality is disabled
> only results in a 2% slowdown.
>
> The micro-benchmark itself is attached; apply it to Clang and run  
> Clang on a file containing a single integer to get some simple  
> measurements, e.g.,
>
> Smart pointer time = 0.021950s
> Raw pointer+ExprResult time = 0.025072s
> Raw pointer time = 0.021499s

Just out of curiosity:  Is there a correctness difference between  
these configurations?  I.e. does one leak more memory than another?   
It is always more expensive to call delete than not to.
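
To make the question concrete, here is a minimal sketch of the kind of
difference I have in mind. Everything in it is hypothetical: Node,
OwningPtr, and leakedAfterScope are made-up names, and a template flag
stands in for DISABLE_SMART_POINTERS. It is not Clang's code, only an
illustration of how an owning and a non-owning configuration could differ
in what they free.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical AST node; counts live objects so a leak is observable.
struct Node {
    static std::size_t live;
    Node() { ++live; }
    ~Node() { --live; }
};
std::size_t Node::live = 0;

// Hypothetical wrapper, NOT Clang's actual class.
// Owning == true  : behaves like a smart pointer and deletes on scope exit.
// Owning == false : behaves like a raw pointer (the assumed effect of
//                   defining DISABLE_SMART_POINTERS) and never deletes.
template <bool Owning>
class OwningPtr {
    Node *ptr_;
public:
    explicit OwningPtr(Node *p) : ptr_(p) {}
    OwningPtr(const OwningPtr &) = delete;
    OwningPtr &operator=(const OwningPtr &) = delete;
    ~OwningPtr() { if (Owning) delete ptr_; }
    Node *get() const { return ptr_; }
};

// Returns how many Node objects are still alive after the wrapper
// goes out of scope without anyone taking ownership.
template <bool Owning>
std::size_t leakedAfterScope() {
    const std::size_t before = Node::live;
    {
        OwningPtr<Owning> p(new Node);
        assert(p.get() != nullptr);
    }
    return Node::live - before;
}
```

Under these assumptions, leakedAfterScope<true>() frees the node and
leakedAfterScope<false>() leaves it allocated, so the second variant
skips the cost of the delete but leaks.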

-Howard
