[cfe-dev] Clang Preprocessor Speed Up

Andy Gibbs via cfe-dev cfe-dev at lists.llvm.org
Sat Jun 4 00:43:06 PDT 2016


On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:
> Ok, that would be a miracle.
>  
> As I understand, unfortunately getting rid of proper source locations causes
> big problems, yet it's currently hard for me to find some small samples on
> which everything crashes. I mean, it would be great to reduce the crash
> code dramatically, for now the examples of misbehavior are huge.  
> I think a good understanding of interactions between SourceManager, Lexer
> and Parser is needed now because it's somewhere in between their
> communication, so to say, which turns to be broken after the patch. I see a
> possibility in trying to save proper SourceLocations, yet I'm not sure it
> will be such an easy feature. 
> I wonder if I can find some person who deeply understands the concept of
> SourceLocations so that it's possible to discuss the underlying problems?
> Guys, I need your piece of advice :|

My suggestion, but others should jump in here too, is that you split the patch 
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I 
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches 
working if you work on the two parts independently.  Plus you indicated that 
the first still has a sizeable benefit.  And I think the second will give you 
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the 
SourceManager, Lexer and Parser.  There are others on this mailing list who 
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.  
Please note that I haven't assessed the practicality of this, and I'm writing 
as I think so it may not even be coherent!  Caveats aside, you talked 
previously of having this as a compiler switch.  I think this is a good idea 
since the SourceLocation information is needed for debugging macro expansions.  
I would suggest implementing this switch as a depth counter so that 
SourceLocation information is only once a macro expansion reaches a certain 
depth the SourceLocation information will be elided.  In a sense, the 
important information of a macro expansion is the SourceLocations of the 
original tokens going into the macro and the SourceLocation of the final 
scratch buffer.  At this point, I'm wondering whether it is useful to have 
SourceLocation information for concatenated tokens... Maybe someone else can 
give there thoughts here.

The other thing you can think about here is that of actually releasing a 
scratch buffer no longer in use.  I think clang doesn't do this currently 
(probably because of keeping all the SourceLocation information), but if you 
looked at having a reference count on a scratch buffer then maybe it will be 
possible to release it once its last use is complete.  This has the benefit of 
also freeing memory... and may also improve the speed of the compiler.  Again, 
others should give their thoughts here.  Personally, I would be very 
interested to see if this was a possible line of improvement since with 
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is 
sort of linked to eliding SourceLocation information, but it has the advantage 
that you are freeing the whole scratch buffer since it has been incorporated 
into another higher-level scratch buffer, and so is no longer referenced 
anywhere and its disappearance shouldn't (hopefully!) cause so many problems 
in clang diagnostics, for example, which even so would need to be adapted to 
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be 
very careful because macro definitions can change, yes even inside a macro 
expansion too (think pragma pop_macro).  For that matter, you need to think 
about how to handle any _Pragma expansion since these can cause side effects.  
This is why I would leave this side of your patch until last.  It will be the 
hardest to test and guarantee correctness.

Keep up the good work!  :o)

Cheers,
Andy




More information about the cfe-dev mailing list