<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 9, 2014 at 9:58 AM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Tue, Oct 7, 2014 at 5:03 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Tue, Oct 7, 2014 at 2:29 PM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Tue, Oct 7, 2014 at 11:43 AM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Mon, Oct 6, 2014 at 3:46 PM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span>On Fri, Oct 3, 2014 at 11:45 AM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br></span><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Given that these are parts of a patch series, some of which have already been signed off on/I haven't looked at, it's a bit hard to review the overall design - but I believe you & Richard talked that over, so I'm not too worried about that (just explaining why my review feedback's going to be a bit narrow)<br><br><pre style="color:rgb(0,0,0)">+ if (NumTypos > 0 && !ExprEvalContexts.empty())
+ ExprEvalContexts.back().NumTypos += NumTypos;
+ else
+ assert(NumTypos == 0 && "There are outstanding typos after popping the "
+ "last ExpressionEvaluationContextRecord");
Could this be simplified? I think it's equivalent to:
if (!ExprEvalContexts.empty())
ExprEvalContexts.back().NumTypos += NumTypos;
else
assert(NumTypos == 0 ... );
or:
assert(ExprEvalContexts.empty() || NumTypos != 0 && ...)
if (!ExprEvalContexts.empty())
...</pre></div></blockquote></span><div>Yes, the "NumTypos > 0" was unnecessary and has been removed. (FYI, the second form you suggested doesn't quite work because the assert would fail on the common case where ExprEvalContexts is not empty and there are no typos.)<br></div></div></div></div></blockquote><div><br></div></span><div>Hmm, right, maybe:<br><br> assert(ExprEvalContexts.empty() ^ NumTypos != 0 && ... )<span><br> if (!ExprEvalContexts.empty())<br> ExprEvalContexts.back().NumTypos += NumTypos;<br><br></span>but perhaps that's just too confusing. I just find an assertion as the only thing inside a block of code to be a bit off... but it's certainly not unprecedented in LLVM and is probably more legible - just my own hangup to get over :)</div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)">There's an unnecessary semicolon at the end of the "while (true)" loop in <span style="font-family:arial">TransformTypos::Transform</span></pre></div></blockquote></span><div>Removed. <br></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><br>
I'd roll this up a bit:
<pre>+ auto &State = SemaRef.getTypoExprState(TE);
+ if (State.DiagHandler)
+ (*State.DiagHandler)(State.Consumer->getCurrentCorrection());
into:
if (auto *Handler = SemaRef.getTypoExprState(TE).DiagHandler)
(*DiagHandler)(State.Consumer->getCurrentCorrection());</pre></pre></div></blockquote></span><div>That rollup doesn't work because State would be unknown when accessing State.Consumer in the argument to the DiagHandler. </div></div></div></div></blockquote></span><div><br>Oh, right!<br></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre>count in the condition, plus insert in the body of the if in TransformTypos::<span style="font-family:arial">TransformTypoExpr would be more efficient as a single insert-and-check (by avoiding two separate lookups into the set):
</span>
if (TypoExprs.insert(E).second) // means this element previously didn't exist in the set, and now does</pre></pre></div></blockquote></span><div>I've changed the count-then-insert to just an insert since SmallPtrSetImpl::insert returns true when the item wasn't already in the set.</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre>This:
<pre> TransformCache.find(E) != TransformCache.end()</pre>might commonly be written as:
<pre> TransformCache.count(E)
Optionally with "!= 0", but often count is used directly as a boolean test, especially when it's a single set/map (not a multi set/map).</pre></pre></pre></div></blockquote></span><div>Changed. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre></pre><pre>This comment didn't quite make sense to me:
<pre>+ // If corrections for the first TypoExpr have been exhausted, retry them
+ // against the next combination of substitutions for all of the other
+ // TypoExprs.
When you say "retry them" which them do you mean? The corrections for the first TypoExpr? But what does it mean for them to be exhausted then, if they can still be considered?
Maybe I need a bit of help with the higher level algorithm here - I assume the idea is:
Call transform.
As it visits TypoExpr nodes, it tries their first typo candidate available.
If we get to the end and have invalid code, reset the state of all the typo corrections except the first one (on its next query, it'll give back its second alternative).</pre></pre></pre></pre></div></blockquote></span><div>Here is where the confusion is at. As the transform visits each TypoExpr node it tries the first candidate available. If the first candidates of all the TypoExprs don't work, it then tries the next (and subsequent) candidates of only the first TypoExpr that was seen. When it exhausts the candidates from the first TypoExpr, it then resets the stream of candidates for the first TypoExpr and fetches the next candidate of the TypoExpr, continuing in this way until all combinations of candidates of all TypoExprs have been tried. </div></div></div></div></blockquote><div><br></div></span><div>So I'm still a bit confused about which parts of the retry logic are in which parts of the code.<br><br>OK, I think I've got it. Transform is called and we don't even know which TypoExprs are in the Expr. So we transform.<br><br>As transform runs it comes across TypoExprs, and attempts to transform them. For each one, it tries candidates until it finds a good one (based on the immediate context - what are the cases where BuildDeclarationNameExpr fails here? (when does NE.isInvalid fail?)). <br></div></div></div></div></blockquote><div><br></div></div></div><div>I think there is a bit of confusion on what a "retry" is. Within TransformTypoExpr, for either the first TypoExpr ever encountered by this TreeTransform or for a TypoExpr without a cached substitution, it tries to build a valid subexpression from the TypoCorrection candidates (e.g. a DeclRefExpr) and continues trying candidates until it finds one for which a valid subexpression can be formed, which it caches and returns. If none of the candidates worked, it will cache and return an invalid ExprResult. The TreeTransform then takes the returned subexpression and tries to rebuild the full expression (possibly trying to transform multiple TypoExprs along with CallExprs, etc).</div><div><br></div><div>I don't remember off the top of my head without looking at the body, but I think BuildDeclarationNameExpr will fail in cases where the correction is e.g. a type name, or a member reference name--basically, BuildDeclarationNameExpr returns an ExprResult and the correct behavior in principle is to check for an invalid result and try the next correction candidate when one is seen--even if the Consumer never returns a candidate for which BuildDeclarationNameExpr would fail.</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>So, in your example below, maybe the right answer is BEH, but AF is valid, with no correct candidate for the 3 choice. <br></div></div></div></div></blockquote><div><br></div></span><div>No, AF with no correction for the 3rd TypoExpr would result in the entire Expr result being invalid within the main Transform method.</div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>Consumers start at "before the first" element, calling "getNextCorrection" on a new consumer gives you the first correction.<br></div></div></div></div></blockquote><div><br></div></span><div>That is correct. And the "before the first" element is also an empty correction for use with getCurrentCorrection. ;)</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>On the first call to TransformExpr, the first call to TransformTypoExpr chooses A, records it, continues. <br>Gets to the second TransformTypoExpr, tries D and E, succeeds at F.<br></div></div></div></div></blockquote><div> </div></span><div>it would only try D then E then F within a single call to TransformTypoExpr if BuildDeclarationNameExpr failed for D and E.</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>On the third TransformTypoExpr, it tries H and I, both fail, so it returns from the last line ("return TransformCache[E] = ExprError]).<br></div></div></div></div></blockquote><div> </div></span><div>Same case here--that last line would only be reached if H and I both failed within the call to BuildDeclarationNameExpr. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>We get back to Transform with a failure. So we walk the TypoExprs (in CheckAndAdvanceTypoExprCorrectionStreams) we reset the 3rd consumer (because it was finished).<br></div></div></div></div></blockquote><div> </div></span><div>No, CheckAndAdvanceTypoExprCorrectionStreams would do nothing if the first TypoExpr is not finished. If the first TypoExpr is finished, then it would reset it and clear the cached result for the second TypoExpr (so that the next call to TransformTypoExpr on the second TypoExpr would try a new candidate instead of using the cached result). Only if both the first and second TypoExprs were *both* finished would the cached value of the third TypoExpr be cleared (and the candidate streams for both the first and second TypoExprs would be reset in addition to their cached results being removed from the cache). The only way CheckAndAdvanceTypoExprCorrectionStreams would see if the third TypoExpr was finished would be if both the first and second TypoExprs were also finished, and upon resetting it the tree transform would only be attempted again if there were at least four known TypoExprs (i.e. the number of known TypoExprs is greater than the number of TypoExprs that were finished and had to be reset by CheckAndAdvanceTypoExprCorrectionStreams).</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>We transform again:<br><br>First typo) We are at A so we call next and get B<br></div></div></div></div></blockquote><div><br></div></span><div>Yes. The first TypoExpr would always have it's next candidate tried by a call to TransformTypoExpr.</div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Second typo) From F we continue and try G, fail, then we're finished for the second typo.<br></div></div></div></div></blockquote><div><br></div></span><div>No, G would only be tried if the result from transforming F was removed from the cache.</div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Third typo) Perhaps we find some solution for this, or not, it doesn't matter.<br><br>We get back to Transform with a failure (second typo), So we walk the TypoExprs again, resetting the second (and maybe the third).<br></div></div></div></div></blockquote><div><br></div></span><div>If there are only three TypoExprs, then resetting the third TypoExpr would mean there are no valid combinations of corrections that yield a valid expression (where the final result of the transform is neither an invalid ExprResult nor caused any errors to be reported by the SFINAETrap). </div></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>First typo) We are at B so we call next and get to C, let's say we fail here - C isn't even remotely valid.<br>Now it really matters whether we find/fail on the second and third typo. If we fail on the second and third... we're going to give up. We'll find that all the corrections are finished and we'll conclude that we've tried everything. <br><br>Am I following this right?<br></div></div></div></div></blockquote><div><br></div></span><div>Not quite. I think you are forgetting about the fact that, for all TypoExprs except the first, a cached result is used if it exists instead of fetching a new candidate from the Consumer and trying to build a valid Expr with it. </div></div></div></div></blockquote><div><br></div></div></div><div>Ah, looking at the code again I see what you mean. The 'break' is the bit that I missed in the reset/advance loop.<br><br>Which patches would I have to apply to ToT to be able to exercise this code? It might help for me to step through some of these scenarios in a debugger.<br></div></div></div></div></blockquote><div><br></div></div></div><div>To exercise this code, you'd need the full patch series (all 8) to exercise this code because it's not until patch #8 that anything actually generates TypoExprs in the AST. I can send you a combined patch rebased on ToT off-list if you'd like (it's ~112KB uncompressed), and the new test cases in test/SemaCXX/typo-correction-delayed.cpp explicitly test this functionality.</div></div></div></div></blockquote><div><br></div><div>That'd be great!</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>(I'm sort of wondering if this might be better expressed by some iterator scheme, but I'm not sure - there's no pretty next_combination algorithm (someone proposed one to boost at some point, it seems, but that's about as close as it's got to anything common/generic) anyway)</div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>So for 3 TypoExprs... let's say 1 has the candidates (A, B, C), 2 has the candidates (D, E, F, G) and 3 has the candidates (H, I) the iterations would test the combinations in order:</div><div><br></div><div>A D H</div><div>B D H</div></div></div></div></blockquote><div><br></div></span><div>I suppose a simpler form of my above example is: if, on the first pass, we get ADH, on the next pass the code as-is would result in BCI, wouldn't it? It'll "next" each of the corrections.</div></div></div></div></blockquote><div><br></div></span><div>(I think you meant BEI, and C is only a candidate of the first TypoExpr, not the second one.) No it won't because of the caching. The second TypoExpr is only effectively "next"ed once all of the candidates for the first TypoExpr have been tried with the current candidate of the second TypoExpr, and the third TypoExpr would only be "next"ed if all of the candidates of both the first and second TypoExpr have been tried with the current candidate of the third TypoExpr.</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>And I'm still not quite understanding why there's the E != TypoExprs[0] condition... maybe that's part of what I'm missing. It looks like that code would cause, in a single Transform, multiple instances of the first typo to be computed with different/subsequent typo results... which seems confusing/strange. (though I don't fully understand how/where multiple instances of the same TypoExpr will be visited in a single Transform)</div></div></div></div></blockquote><div><br></div></span><div>That condition ensures that the cached result is never used for the first TypoExpr (instead of having an extra conditional for all of the return statements below here to not cache the result for the first TypoExpr).</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>(some nitpicks of the code - there's a duplicate lookup of TransformCache (count then [], or count then insert further down the algorithm - it should probably just be:<br><br>ExprResult &CacheEntry = TransformCache[E];<br>if (CacheEntry)<br> return CacheEntry;<br></div></div></div></div></blockquote><div><br></div></span><div>This wouldn't work because "TransformCache[E]" would add a cache entry for E, and either the "if (CacheEntry)" would always be true, or an ExprError() would be considered false and an ExprError cache entry would always be ignored, both of which would be bad. </div></div></div></div></blockquote><div><br></div></span><div>you're right that ExprResult isn't boolean testable - but it is possible to distinguish ExprError from a default-constructed ExprResult. A default constructed ExprResult is in the valid-but-unset state, so far as I can tell, so it would look like:<br><br>if (!CacheEntry.isUnset())<br> return CacheEntry<br><br>(which is, admittedly, not the /most/ legible thing ever)</div></div></div></div></blockquote><div><br></div></span><div>I'm worried this won't work, or at least would be (potentially) fragile, because .isUnset() is defined as "valid-but-null" and Sema sets valid-but-null as an explicit state in the return values of various functions--at least a few of which I saw while looking at users of the typo correction code.</div></div></div></div></blockquote><div><br></div><div>Ooh... I could totally see that happening (& thanks for the explanation).</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Of course, a new local variable could be created containing the result of TransformCache.find(E) and then compared to TransformCache.end() and returning as appropriate, but I find it to be more verbose and less clear:</div><div><br></div><div>auto CacheEntry = TransformCache.find(E);</div><div>if (!TypoExprs.insert(E) && CacheEntry != TransformCache.end())</div><div> return CacheEntry->second;</div><div><br></div><div>And neither of the subsequent return statements would be changed because the cache entry would still need to be created...</div></div></div></div></blockquote><div><br></div></span><div>If you were using the iterator version, they could be hinted insertions which might be better - but I agree, if you're just using the iterators it's probably not terribly helpful. But as shown above, you can actually just use the op[] at the start, check !isUnset, then initialize the elements. This is a fairly common idiom in map related code, though more often it looks like </div></div></div></div></blockquote><div><br></div></span><div>After a bit more consideration, I've taken your suggested approach of using .isUnset() and added an assertion on trying to cache an "unset" value ("assert(!NE.isUnset() && ...);"). </div></div></div></div></blockquote><div><br></div><div>Sounds like a fair tradeoff - if the assertion does fire we can add a test case for it & switch back to something like your original formulation.<br><br>The other gotcha to be aware of is that since it's a reference into a DenseMap are invalidated by insertions - so if any of the work to initialize the TransformCache (the while nextCorrection loop, and probably most importantly the "BuildDeclarationNameExpr" call) can also modify the TransformCache (by transforming another typo expr) then the reference will be invalidated and the approach I'd suggested won't be applicable (you could still keep just one lookup in the cache hit case, but in the cache miss you'd have to do a second lookup after the work)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>...<br> return CacheEntry = NE;<br>}<br>return CacheEntry = ExprError();<br><br>)<br><br></div><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>C D H</div><div>A E H</div><div>B E H</div><div>C E H</div><div>A F H</div><div>B F H</div><div>C F H</div><div>A G H</div><div>B G H</div><div>C G H</div><div><div>A D I</div><div>B D I</div><div>C D I</div><div>A E I</div><div>B E I</div><div>C E I</div><div>A F I</div><div>B F I</div><div>C F I</div><div>A G I</div><div>B G I</div><div>C G I</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre></pre></pre></div></blockquote></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre><pre>If we get to the end, have invalid code, and the first typo has no alternatives left to try - I guess we just want to print errors for everything, without typo corrections?</pre></pre></pre></pre></div></blockquote><br></span><div>If all of the combinations of typo corrections failed to create an acceptable Expr, then the current correction for all of TypoExprs would be an empty TypoCorrection which would cause the corresponding diagnostic handlers to print out errors without suggestions.</div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre><pre>It might be helpful to split out some of these chunks (the state resetting code and maybe the diagnostic printing at the end) into separate functions to give them logical names & make the higher level algorithm more clear?</pre></pre></pre></pre></div></blockquote></span><div>I've split the diagnostics emission and the state handling into separate helpers, and expanded their description comments. I'm not keen on the name for the state handler but it is the best name I could come up with that was even remotely descriptive.<br></div></div></div></div></blockquote><div><br></div></div></div><div>Roughly works, though I'm still quite confused, it seems. (alternative name for that function would be "reset" rather than "advance" - but once I understand the algorithm better I might have better suggestions) </div></div></div></div></blockquote><div><br></div></div></div><div>The function removes the cached result for the first TypoExpr then checks the TypoExpr's correction candidate stream; if it has reached the end of the stream, it will reset the stream then perform the same check for the second TypoExpr, continuing until it either has reset the correction streams for all of the TypoExprs or has found a TypoExpr that hadn't reached the end of its correction stream. And in the former case, it would signal that all combinations of corrections had been retried and the TreeTransform should stop searching for a valid combination of corrections. And now that I think about it, the "E != TypoExprs[0]" is probably now a superfluous check since calling CheckAndAdvanceTypoExprCorrectionStreams would always remove the first TypoExpr from the result cache.</div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre><pre>The loop at the end ("<span style="font-family:arial">for (auto E : TypoExprs) {") is over a hashing data structure, right? Is that going to cause unstable output of diagnostics? (eg: different machines, different memory layout, the same program might get the typos printed in a different order) or is there something else that will fix that?</span></pre></pre></pre></pre></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre><pre><span style="font-family:arial">
If you have to change the set to be a SetVector, or similar (to stabilize output) - then you won't need the FirstTypoExpr anymore - it'll just be the first one in the set, since you iterate in insertion order at that point.</span></pre></pre></pre></pre></div></blockquote></span><div>Switched it from a SmallPtrSet to a SmallSetVector like it should have been. Doing so also allowed me to simplify the state handling code by being able to assume a deterministic order, and eliminating FirstTypoExpr. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"><pre><pre><pre><span style="font-family:arial">
"</span><span style="font-family:arial">while (TypoCorrection TC = State.Consumer->getNextCorrection()) {" doesn't seem to be a loop - it has an unconditional return at the end of the body. Should it just be an 'if'?</span></pre></pre></pre></pre></div></blockquote></span><div>There was actually a missing check on whether the result of Sema::BuildDeclarationNameExpr was valid.</div></div></div></div></blockquote><div><br></div></span><div>Were there missing/added test cases to cover this situation? <br></div></div></div></div></blockquote><div><br></div></span><div>No because at this point in the patch series, there is nothing that actually generates TypoExprs. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><pre style="color:rgb(0,0,0)"></pre></div><div><div><div class="gmail_extra">"<span style="color:rgb(0,0,0)">+ (FullExpr.get()->isTypeDependent() ||</span><pre style="color:rgb(0,0,0)">+ FullExpr.get()->isValueDependent() ||
+ FullExpr.get()->isInstantiationDependent())) {"
What are those checks for? I don't know what condition we would be in if we had typo corrections but it wasn't type, value, or instantiation dependent. Perhaps a comment could explain that?</pre></div></div></div></blockquote></span><div>The check is to avoid running the tree transform on Expr trees that are known to not have a TypoExpr in them, regardless of whether the current expression evaluation context still indicates there are uncorrected typos. I've added a comment to that effect. </div></div></div></div></blockquote><div><br></div></span><div>Hmm - I suppose my question might then be: When would we have NumTypos > 0 but be in a situation where the full expression cannot have any typos in it?</div></div></div></div></blockquote><div><br></div></span><div>I don't remember off the top of my head because it's been a few months now since I wrote the original patches... but basically any situation where more than one FullExpr shares an expression evaluation context or when the current expression evaluation context had its NumTypos increased by an expression evaluation context that was popped off the stack. And TreeTransforms aren't cheap enough to blindly run on every Expr when NumTypos is non-zero. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><pre style="color:rgb(0,0,0)">To simplify <span style="font-family:arial">Sema::clearDelayedTypo you could add MapVector::erase(const KeyType&) (given that std::map has such an operation, it seems like a reasonable one for MapVector to have).</span></pre></div></div></div></blockquote></span><div>Done (simple patch attached).</div></div></div></div></blockquote></span><div><br>Great - if you'd like to commit this separately, it should have a unit test. Otherwise it's probably simple enough to commit along with your new usage if you prefer.<br></div></div></div></div></blockquote><div><br></div></span><div>Committed with unit test as r219240. </div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><span style="color:rgb(0,0,0)">Sema::getTypoExprState looks like it could be a const function (it looks like it's never meant to be called when the element isn't already in the map) - using 'lookup' instead of operator[] will be a const operation on the map (& assert if the element doesn't exist in the map).<br></span></div></div></div></blockquote><div><br></div></span><div>Sadly I cannot do that because the operator[] cannot be used if the method is made const, and .lookup() can't be used because it returns by-value instead of by-reference and TypoExprState is not copyable:</div><div><br></div><div><div>../tools/clang/lib/Sema/SemaLookup.cpp:4600:10: warning: returning reference to local temporary object [-Wreturn-stack-address]</div><div> return DelayedTypos.lookup(TE);</div><div><br></div><div>../include/llvm/ADT/MapVector.h:88:30: error: call to implicitly-deleted copy constructor of 'const clang::Sema::TypoExprState'<br></div></div><div><div> return Pos == Map.end()? ValueT() : Vector[Pos->second].second;</div></div></div></div></div></blockquote><div><br></div></span><div>Oh, weird - my mistake in assuming/misremembering what lookup actually does. I'd still make getTypoExprState const and use find+assert just to constrain the interface a bit more rather than leaving it possible to accidentally create a new entry in the map when the API is only intended to query existing entries.</div></div></div></div></blockquote><div><br></div></span><div>Done. The new changes I've made overall are:</div><div><br></div><div><div>diff --git a/include/clang/Sema/Sema.h b/include/clang/Sema/Sema.h<br></div><div>index b30bf5e..f644fdd 100644</div><div>--- a/include/clang/Sema/Sema.h</div><div>+++ b/include/clang/Sema/Sema.h</div><div>@@ -2628,7 +2628,7 @@ private:</div><div> bool ErrorRecovery, bool &IsUnqualifiedLookup);</div><div> </div><div> public:</div><div>- const TypoExprState &getTypoExprState(TypoExpr *TE);</div><div>+ const TypoExprState &getTypoExprState(TypoExpr *TE) const;</div><div> </div><div> /// \brief Clears the state of the given TypoExpr.</div><div> void clearDelayedTypo(TypoExpr *TE);</div><div>diff --git a/lib/Sema/SemaExprCXX.cpp b/lib/Sema/SemaExprCXX.cpp</div><div>index 73aa997..1f08276 100644</div><div>--- a/lib/Sema/SemaExprCXX.cpp</div><div>+++ b/lib/Sema/SemaExprCXX.cpp</div><div>@@ -6028,7 +6028,7 @@ public:</div><div> // If the TypoExpr hasn't been seen before, record it. Otherwise, return the</div><div> // cached transformation result if there is one and the TypoExpr isn't the</div><div> // first one that was encountered.</div><div>- if (!TypoExprs.insert(E) && E != TypoExprs[0] && TransformCache.count(E)) {</div><div>+ if (!TypoExprs.insert(E) && TransformCache.count(E)) {</div><div> return TransformCache[E];</div><div> }</div><div> </div><div>diff --git a/lib/Sema/SemaLookup.cpp b/lib/Sema/SemaLookup.cpp</div><div>index 8ddb5fd..bb6c76a 100644</div><div>--- a/lib/Sema/SemaLookup.cpp</div><div>+++ b/lib/Sema/SemaLookup.cpp</div><div>@@ -4601,8 +4601,11 @@ TypoExpr *Sema::createDelayedTypo(std::unique_ptr<TypoCorrectionConsumer> TCC,</div><div> return TE;</div><div> }</div><div> </div><div>-const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr *TE) {</div><div>- return DelayedTypos[TE];</div><div>+const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr *TE) const {</div><div>+ auto Entry = DelayedTypos.find(TE);</div><div>+ assert(Entry != DelayedTypos.end() &&</div><div>+ "Failed to get the state for a TypoExpr!");</div><div>+ return Entry->second;</div><div> }</div><div> </div><div> void Sema::clearDelayedTypo(TypoExpr *TE) {</div></div><div><div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><span style="color:rgb(0,0,0)"><br>Otherwise this all looks like rather neat stuff - though I'm not as able to review in detail some of the refactoring of existing typo correction stuff. It looks plausible.</span><span style="color:black"> </span></div></div></div></blockquote></span><div><br>I've attached the updated patch. Thanks for reviewing this, David!<br></div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><span style="color:black"> </span></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><span style="color:black"> </span></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div><div class="gmail_extra"><div class="gmail_quote">On Thu, Sep 25, 2014 at 2:00 PM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">ping</div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 26, 2014 at 11:05 AM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><br>+dblaikie</div><div><div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jul 31, 2014 at 1:20 PM, Kaelyn Takata <span dir="ltr"><<a href="mailto:rikka@google.com" target="_blank">rikka@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
Part of the infrastructure is a map from a TypoExpr to the Sema-specific<br>
state needed to correct it, along with helpers to ease dealing with the<br>
state.<br>
<br>
The the typo count is propagated up the stack of<br>
ExpressionEvaluationContextRecords when one is popped off of to<br>
avoid accidentally dropping TypoExprs on the floor. For example,<br>
the attempted correction of g() in test/CXX/class/class.mem/p5-0x.cpp<br>
happens with an ExpressionEvaluationContextRecord that is popped off<br>
the stack prior to ActOnFinishFullExpr being called and the tree<br>
transform for TypoExprs being run.<br>
---<br>
include/clang/Sema/Sema.h | 44 +++++<br>
include/clang/Sema/SemaInternal.h | 15 +-<br>
include/clang/Sema/TypoCorrection.h | 2 +-<br>
lib/Sema/SemaExpr.cpp | 7 +<br>
lib/Sema/SemaExprCXX.cpp | 108 ++++++++++++<br>
lib/Sema/SemaLookup.cpp | 316 ++++++++++++++++++++++++------------<br>
6 files changed, 384 insertions(+), 108 deletions(-)<br>
<br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></span></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div><br></div></div>