I just finished up a new version of my patch earlier this week, which no longer uses the RecursiveASTVisitor (per chandlerc's suggestion). I've rebased the patch against the current SVN and have attached it for you to test. With the new version only the <meta http-equiv="content-type" content="text/html; charset=utf-8">FixIt/typo.cpp and SemaObjC/synth-provisional-ivars.m tests are failing, and for those two I'm not entirely sure which parts of the behavior are correct/acceptable and which might be a bug in my changes.<div>
<br></div><div>Cheers,</div><div>Kaelyn<br><br><div class="gmail_quote">On Tue, Mar 8, 2011 at 7:46 PM, Douglas Gregor <span dir="ltr"><<a href="mailto:dgregor@apple.com">dgregor@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im"><br>
On Feb 22, 2011, at 11:21 AM, Kaelyn Uhrain wrote:<br>
<br>
> This patch implements checking available C++ namespaces when trying to find typo corrections for unknown identifiers. There are currently 4 broken unit tests (CXX/basic/basic.lookup/basic.lookup.argdep/p4.cpp, CXX/class/class.local/p1.cpp, FixIt/typo.cpp, and SemaObjC/synth-provisional-ivars.m) which I'm fairly sure are cases where the tests need to be updated to reflect the new behavior instead of being indicative of actual code breakage, but I'm not 100% certain. The new unit test, test/Sema/missing-namespace-qualifier-typo-corrections.cpp, is a good demonstration of the improved suggestions provided by this patch as most of the errors found by "clang -fsyntax-only" lack suggestions without this patch..<br>
<br>
</div>This is really cool, and is be a great step forward in usability.<br>
<br>
My only really concern is about performance, so I did a quick measurement. My test was to put all of the C++ Standard Library headers into a PCH file, then include it in a program that amounted to, basically,<br>
<br>
int main() {<br>
vector<int> v;<br>
std::Vector<int> v2;<br>
}<br>
<br>
so there are two typo corrections. And here is some timing data from before/after the patch:<br>
<br>
Before:<br>
<br>
*** AST File Statistics:<br>
18 stat cache hits<br>
2 stat cache misses<br>
97/39174 source location entries read (0.247613%)<br>
11142/18925 types read (58.874504%)<br>
16712/32353 declarations read (51.655178%)<br>
1734/6785 identifiers read (25.556374%)<br>
1861/70867 statements read (2.626046%)<br>
0/1766 macros read (0.000000%)<br>
188/3464 lexical declcontexts read (5.427252%)<br>
43/695 visible declcontexts read (6.187050%)<br>
<br>
Wall time: 0m0.051s<br>
<br>
After:<br>
<br>
*** AST File Statistics:<br>
18 stat cache hits<br>
2 stat cache misses<br>
97/39174 source location entries read (0.247613%)<br>
17162/18925 types read (90.684280%)<br>
28421/32353 declarations read (87.846565%)<br>
3943/6785 identifiers read (58.113487%)<br>
<a href="tel:70371%2F70867">70371/70867</a> statements read (99.300095%)<br>
0/1766 macros read (0.000000%)<br>
3383/3464 lexical declcontexts read (97.661659%)<br>
6080/695 visible declcontexts read (874.820129%)<br>
<br>
Wall time: 0m0.115s<br>
<br>
So it's a little over 2x slower, and (more importantly) deserializes a whole lot more from the PCH file. That's actually a concern, since typo correction still needs to be fast for many applications, and the additional disk I/O really becomes a problem as the PCH files get larger (e.g., containing significant chunks of Boost).<br>
<br>
I have a recommendation. Rather than using the recursive AST visitor to walk the entire AST, can we instead<br>
<br>
(1) First do the "identifier" lookup that we already do, to find the best name that we know about (which could be anywhere in the translation unit), then<br>
(2) If lookup for that identifier fails, go look in all of the namespaces that we know about and see if the identifier is located in one of those.<br>
<br>
We'd have to have some list of known namespaces stored somewhere (and saved in the PCH file for fast retrieval), but at least this way we'd be paying the cost we already pay now to look at the identifiers + a cost that's O(number of unique namespaces).<br>
<br>
Does that seem reasonable?<br>
<br>
<br>
- Doug</blockquote></div><br></div>