<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Thanks for your comments. Replies inline.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The DWARF 5 standard says that<br>
"Address range entries in a range list may not overlap."<br>
<br>
The reasoning behind this is presumably that if a variable is in more than one<br>
location at a point all the values need to be identical, or the information is useless</blockquote><div><br></div><div>Oh huh, for some reason I was under the impression that they could. No matter, all we would have to do then is choose one in the backend. I think it makes sense to maintain the notion of separate multiple locations until then.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
><br>
> - To add a location with the same value for the same variable, you pass the<br>
> token of the FIRST llvm.dbg.value, as this llvm.dbg.value's first argument<br>
> E.g. to add another location for the variable above:<br>
><br>
> %second = call token @llvm.dbg.value(token %first, metadata %val2,<br>
> metadata !var, metadata !expr2)<br>
<br>
</span>Does this invalidate the first location, or does this add an additional location<br>
to the set of locations for var at this point? If I want to add a third location,<br>
which token do I pass in? Can you explain a bit more what information the token<br>
allows us to express that is currently not possible?<span class=""><br></span></blockquote><div><br></div><div>It adds a second location. If you want to add a third location you pass in the first token again.</div><div>Thus the first call (key call) indicates a change of values, and all locations that have the same value should use the key call's token.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
><br>
> - To indicate that a location will no longer hold a value, you do the<br>
> following:<br>
><br>
> call token @llvm.dbg.value(token %second, metadata token undef,<br>
> metadata !var, metadata !())<br>
><br>
> - The current set of locations for a variable at a given instruction are all<br>
> those llvm.dbg.value instructions that dominate this location (<br>
> equivalently all those llvm.dbg.value calls whose token you could use at<br>
> that location without upsetting the Verifier), except that if more than<br>
> one key call is dominating, only the most recent one and all calls<br>
> associated to it by first argument count.<br>
><br>
> I think that should encapsulate the semantics, but here are some consequences<br>
> of and comments on the above that I think would be useful to discuss:<br>
><br>
> - The upgrade path for existing IR is very simple and just consists of<br>
> adding token undef as the first argument to any call in the IR.<br>
><br>
> - In general, if a value gets removed by an optimization, the corresponding<br>
> llvm.dbg.value call can be removed, unless that call is a key call, in<br>
> which case the value should be undefed out. This is necessary both to be<br>
> able to keep it around as the first argument to the other calls, and more<br>
> importantly to mark the end point of a previous set of locations.<br>
<br>
</span>So if %val is optimized out in the following example:<br>
<span class=""><br>
%first = call token @llvm.dbg.value(token undef, metadata %val,<br>
metadata !var, metadata !expr)<br>
</span> ...<br>
<span class=""> %second = call token @llvm.dbg.value(token %first, metadata %val2,<br>
metadata !var, metadata !expr2)<br>
<br>
</span>Does this turn into:<br>
<br>
call token @llvm.dbg.value(token undef, metadata %undef,<br>
metadata !var, metadata !expr)<br>
%second = call token @llvm.dbg.value(token %undef, metadata %val2,<br>
metadata !var, metadata !expr2)<br>
<br>
Or do we still have a %first token, or does the key call get removed entirely, because<br>
the second one is now a key call?<br></blockquote><div><br></div><div>I think the situation is the following:</div><div>If %second is the only use of %first, we can do that optimization. If not, and %second dominates all uses of %first, we could also do this optimization and replace all uses of %first with %second. However, we cannot remove the actual first key call, because it denotes the end location for the previous value of the same variable. Two exceptions I can think of are if %first is the first call for that variable in the function (as then there cannot be a previous range to terminate), or if there are no other calls or memory operations between %first and %second, in which case we could hoist %second up and merge the two calls. Does that make sense?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
><br>
> - I think llvm.dbg.declare can be deprecated and its uses replaced by<br>
> llvm.dbg.value with a DW_OP_deref. That would also clarify the semantics<br>
> of the operation which have caused some confusion in the past.<br>
<br>
</div></div>I think we could already remove it today without any loss of generality (by<br>
lifting any dbg.value whose first argument is an alloca into the MMI table).<br>
What I see this proposal adding is a way to mark the end of a range, which<br>
is important when a value is on the stack only for part of the function (as<br>
in the stack coloring example).</blockquote><div><br></div><div>Agreed! </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
><br>
> - We may want to add an extra pass that does debug info inference (some of<br>
> which is done in InstCombine right now)<br>
<br>
</span>What kind of inference does InstCombine do currently?</blockquote><div><br></div><div>I was thinking of replacing llvm.dbg.declare by appropriate llvm.dbg.value at each load/store.</div><div>In the new design that would essentially be an inference pass which would add those as</div><div>locations, with the original one only removed if the alloca actually gets lifted into registers.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
><br>
> Here are some of the invariants the verifier would enforce (included in the<br>
> hope that they can clarify anything in the above):<br>
><br>
> 1. If the first argument is not token undef, then<br>
> a. If the second argument is not token undef,<br>
> I. the first argument must be a call to llvm.dbg.value whose first<br>
> argument is token undef<br>
> b. If the second argument is token undef<br>
> II. the first argument must be a call to llvm.dbg.value whose second<br>
> argument is not token undef<br>
> III. the expression argument must be empty<br>
> c. In either case, the variable described must be the same as the one<br>
> described by the call that is the first argument.<br>
> d. There may not be another call to llvm.dbg.value with token undef<br>
> that dominates this instruction, is not the one passed as the first<br>
> argument and is dominated by the one passed as the first argument.<br>
> 2. All other invariants regarding calls to llvm.dbg.value carry over<br>
> unchanged<br>
><br>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">-- adrian</font></span></blockquote></div><br></div></div>
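<div>In case it helps the discussion, here is a rough Python model of how I read the "current set of locations" rule, restricted to straight-line code so that "dominates" just means "comes earlier". This is only a sketch under my own assumptions, not an implementation of anything in LLVM: the DbgValue record and the live_locations function are names made up for illustration, None stands in for token undef / metadata token undef, and the expression argument is left out.</div>
<pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DbgValue:
    """Hypothetical stand-in for one llvm.dbg.value call in the proposal."""
    name: str              # name of the returned token, e.g. "first"
    token: Optional[str]   # first argument; None models `token undef`
    value: Optional[str]   # second argument; None models `metadata token undef`
    var: str               # the variable being described


def live_locations(calls, var):
    """Names of the calls that describe `var` after the last call in `calls`.

    Assumes straight-line code, so every earlier call dominates later ones.
    """
    current_key = None
    live = []
    for c in calls:
        if c.var != var:
            continue
        if c.token is None:
            # Key call: ends the previous set of locations and starts a new one.
            current_key = c.name
            live = [c.name] if c.value is not None else []
        elif c.value is None:
            # End marker: the location named by the token stops holding the value.
            live = [n for n in live if n != c.token]
        elif c.token == current_key:
            # Additional location with the same value as the current key call.
            live.append(c.name)
        # A call tied to an older key call no longer counts and is ignored.
    return live


calls = [
    DbgValue("first", None, "%val", "x"),       # key call
    DbgValue("second", "first", "%val2", "x"),  # second location, same value
    DbgValue("end", "second", None, "x"),       # %second no longer holds it
]
print(live_locations(calls[:2], "x"))
print(live_locations(calls, "x"))
```
</pre>
<div>With the three calls above, the set after the second call is first and second, and after the end marker only first remains. A later key call for the same variable would reset the set, which is why an optimized-out key call has to stay around with an undef value instead of being deleted.</div>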