<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jun 24, 2015, at 12:12 PM, Alexey Samsonov &lt;<a href="mailto:vonosmas@gmail.com" class="">vonosmas@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Hi Adrian,<div class=""><br class=""></div><div class="">You might want to take a look at the abandoned <a href="http://reviews.llvm.org/D2658" class="">http://reviews.llvm.org/D2658</a>, where I tried to implement something similar.</div><div class="">Looks like we're now at the point where we *do* require a complicated solution and analysis in DbgValueHistoryCalculator...<br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Tue, Jun 23, 2015 at 4:41 PM, Adrian Prantl<span class="Apple-converted-space"> </span><span dir="ltr" class="">&lt;<a href="mailto:aprantl@apple.com" target="_blank" class="">aprantl@apple.com</a>&gt;</span><span class="Apple-converted-space"> </span>wrote:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">Here is a proposal for improving DbgValueHistoryCalculator and the<br class="">overall quality of debug 
locations.<br class=""><br class="">Focus: This is about lowering the DBG_VALUE machine instructions to<br class="">DWARF location lists.<br class=""><br class="">Non-focus: This is not about (typical -O0) variables that permanently<br class="">reside at a frame index and are described with dbg.declare intrinsics<br class="">in the IR. These variables are stored in the MMI side-table and<br class="">processed separately.<br class=""><br class=""><br class="">The semantics of DBG_VALUE<br class="">==========================<br class=""><br class="">Examples:<br class=""><br class=""> <span class="Apple-converted-space"> </span>DBG_VALUE %EAX, %noreg, !"a", <<0x7fd4bbc17470>>; line no:3<br class=""> <span class="Apple-converted-space"> </span>DBG_VALUE %RBX, 56, !"a", <<0x7ffd5ac3b3c0>>; line no:4 indirect<br class=""> <span class="Apple-converted-space"> </span>DBG_VALUE 0, 0, !"a", <<0x7fd4bbc17470>>; line no:3<br class=""><br class="">The DBG_VALUE machine instruction informs us that a variable (or a<br class="">piece of it) is henceforth stored in a specific location or has a<br class="">constant value. This is valid until the next DBG_VALUE instruction<br class="">describing the same variable or the location is being clobbered.<br class=""><br class="">Lowering today<br class="">==============<br class=""><br class="">1. DbgValueHistoryCalculator takes the MachineFunction graph and<br class=""> produces a list of live ranges for each variable (the<br class=""> DbgValueHistoryMap).<br class=""><br class=""> Note: Variable live ranges are to be consumed by a debugger. 
They<br class=""> refer to the physical addresses and are agnostic of control flow.<br class=""><br class=""> The live ranges produced by DbgValueHistoryCalculator are pairs of<br class=""> MachineInstructions, the first of which is always the DBG_VALUE<br class=""> describing the variable and location.<br class=""><br class=""> The ranges are calculated as follows:<br class=""> - If the location is a register the range extends until the<br class=""> register is clobbered or until the end of the basic block.<br class=""><br class=""> - If the location is a constant the range extends until the next<br class=""> DBG_VALUE instruction describing the same variable.<br class=""><br class=""> - If the location is indirect and the register is not clobbered<br class=""> outside the function prologue and epilogue the the range is the<br class=""> entire function. This is a heuristic to make stack<br class=""> frame-allocated variables work better. Otherwise the range<br class=""> extends until the next DBG_VALUE or the end of the basic block.<br class=""><br class="">2. The buildLocationList() function takes the list of ranges of one<br class=""> variable and builds a location list. A variable may consist of many<br class=""> pieces which have their own live ranges. A live ranges for a piece<br class=""> is split up so that no piece's live range starts or ends in the<br class=""> middle of another piece's live range. Any two consecutive ranges<br class=""> with identical location contents are merged. Labels are requested<br class=""> for start and end of each range.<br class=""><br class="">3. 
The location list is finalized by lowering it into DWARF and<br class=""> emitting it into a buffer.<br class=""><br class="">Shortcomings with the current approach<br class="">======================================<br class=""><br class="">Problems with the current approach include inaccurate live ranges for<br class="">constant values, ranges ending too early (usually at basic block<br class="">boundaries), poor handling of frame-register-indirect variables that<br class="">were introduced by spill code, and poor handling of variables that are in<br class="">more than one location at once.<br class=""><br class=""><br class=""><br class="">A better DbgValueHistoryCalculator<br class="">==================================<br class=""><br class="">Currently DbgValueHistoryCalculator does a very simple, linear pass<br class="">through all MachineInstructions, with a couple of heuristics to make<br class="">common cases such as the frame index variables work.<br class=""><br class="">It would be possible to address many of the current problems by having<br class="">earlier passes emit many more DBG_VALUE instructions. There are<br class="">several problems with this approach though: It distributes the<br class="">complexity across many more passes and thus imposes a maintenance<br class="">burden on authors of prior passes, increases machine code size, and<br class="">stores a lot of redundant information.<br class=""><br class="">What I'm proposing instead is to make DbgValueHistoryCalculator<br class="">smarter at creating ranges. 
The goal is to make hack^H^H^Heuristics such as<br class="">the frame index handling unnecessary by correctly propagating DBG_VALUE<br class="">liveness across basic block boundaries.<br class=""><br class="">To illustrate what I mean I'll use a notation where<br class="">DbgValueHistoryCalculator is defined in terms of a data-flow analysis.<br class=""><br class="">First a couple of data types used in the following pseudo code:<br class="">- A range is a pair of MachineInstructions (start, end) that both belong<br class=""> <span class="Apple-converted-space"> </span>to the same basic block.<br class="">- range[var] is the final list of live ranges for a variable.<br class=""></blockquote><div class=""><br class=""></div><div class="">How come you never remove ranges from range[var]? What if you have smth. like</div></div></div></div></div></div></blockquote><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">BB1:</div><div class=""> DBG_VALUE %RAX, %noreg, !"a"</div><div class=""> jmp BB2</div><div class="">BB2:</div><div class=""> // more minsns</div><div class=""> clobber %RAX</div><div class="">BB3:</div><div class=""> clobber %RAX</div><div class=""> DBG_VALUE %RBX, %noreg, !"a"</div><div class=""> jmp BB2</div></div></div></div></div></div></blockquote><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: 
normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""> </div><div class="">Looks like you would handle BB1, then BB2, and create a range for "a" from the beginning of BB2 to clobber instruction.</div><div class="">But this might be incorrect - if we have jumped to BB2 from BB3, then the location of "a" is anything but %RAX.</div></div></div></div></div></div></blockquote><div><br class=""></div><div>To guarantee termination, new ranges are only committed to ranges[var] when we know they are valid on all paths.</div></div><div class="">BB1 is visited first and we record a range for a from the DBG_VALUE to the end of BB1:</div><div class="">  ranges[a].push_back( (DBG_VALUE, BB2.first_insn) )</div><div class="">(There’s a bug in the pseudo-code: it doesn’t set the end of the range correctly, but the comment conveys the intention.)</div><div class="">When BB2 is visited, the data-flow analysis magic kicks in via the join() function, which I (as Paul pointed out) left out of the pseudo-code. Sorry for the extra confusion. Next time I write up such a proposal I should just implement it so I can make sure that all the pieces are actually there and I don’t just imagine them :-)</div><div class="">The first time BB2 is visited, we only have the information from BB1, so join(BB2, /*pred1*/ BB1, /*pred2*/ BB3) returns an empty set. 
After BB3 is visited, join() will still come back with an empty set because the locations for “a” coming from BB1 (%RAX) and BB3 (%RBX) differ, so their intersection is empty.</div><div class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class="">I think you should respect the control flow somehow, and consider locations of "a" at the beginning of basic block BB only</div><div class="">if you are certain that locations of "a" at the end of all BB's predecessors don't contradict.</div></div></div></div></div></div></blockquote><div><br class=""></div><div>It does take the control flow into account, but I left out the important part from the pseudo-code and replaced it with a hand-wavy statement about it being a data-flow analysis. 
Sorry about that!</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">- open_ranges[BB][var] is the list of not-yet-terminated ranges for<br class=""></blockquote><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"> <span class="Apple-converted-space"> </span>var inside the current basic block BB.<br class="">- outgoing_loc[BB][var] is the list of locations for var valid at<br class=""> <span class="Apple-converted-space"> </span>the end of a basic block BB.<br class=""><br class="">// This code is probably buggy because I didn't run it through a<br class="">// compiler yet, but I hope it serves to illustrate my point.<br class=""><br class="">// Visit a machine instruction.<br class=""></blockquote><div class=""><br class=""></div><div class="">I'm kind of concerned about the runtime cost of this. Looks like this can be at least</div><div class="">O(number_of_minsn * number_of_local_variables), which is already significant. 
And it seems</div><div class="">that you can iterate over the same basic block several times.</div><div class=""><br class=""></div></div></div></div></div></div></blockquote><div class=""><br class=""></div><div class="">Yes, we’ll have to benchmark it very carefully before deciding that the extra complexity is worth it.</div><div class=""><br class=""></div><div class="">thanks for all the feedback</div><div class="">-- adrian</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">transfer(MInsn) {<br class=""> <span class="Apple-converted-space"> </span>// A DBG_VALUE marks the beginning of a new range.<br class=""> <span class="Apple-converted-space"> </span>if (MInsn is a DBG_VALUE(var, loc)) {<br class=""><br class=""> <span class="Apple-converted-space"> </span>for ((rvar, start) : open_ranges[BB])<br class=""> <span class="Apple-converted-space"> </span>// A DBG_VALUE terminates a range started by a previous<br class=""> <span class="Apple-converted-space"> </span>// DBG_VALUE for the same variable, if the described pieces<br class=""> <span class="Apple-converted-space"> </span>// overlap.<br class=""> <span class="Apple-converted-space"> </span>if (var == rvar && piece_overlaps(MInsn, start)) {<br class=""> <span class="Apple-converted-space"> </span>ranges[var].push_back((start, MInsn));<br class=""> <span class="Apple-converted-space"> 
</span>open_ranges[BB][rvar].remove(start));<br class=""> <span class="Apple-converted-space"> </span>}<br class=""> <span class="Apple-converted-space"> </span>open_ranges[BB][var].push_back(MInsn)<br class=""> <span class="Apple-converted-space"> </span>}<br class=""><br class=""> <span class="Apple-converted-space"> </span>// A def of a register may mark the end of a range.<br class=""> <span class="Apple-converted-space"> </span>if (MInsn is a def(reg)) {<br class=""> <span class="Apple-converted-space"> </span>for ((var, start) : open_ranges[BB])<br class=""> <span class="Apple-converted-space"> </span>if (start.loc == reg) {<br class=""> <span class="Apple-converted-space"> </span>ranges[var].push_back((start, MInsn));<br class=""> <span class="Apple-converted-space"> </span>open_ranges[BB][var].remove(start));<br class=""> <span class="Apple-converted-space"> </span>}<br class=""><br class=""> <span class="Apple-converted-space"> </span>// End all ranges in the current basic block.<br class=""> <span class="Apple-converted-space"> </span>if (MInsn.isTerminator())<br class=""> <span class="Apple-converted-space"> </span>for (range : open_ranges[BB]) {<br class=""> <span class="Apple-converted-space"> </span>ranges[var].push_back(range);<br class=""> <span class="Apple-converted-space"> </span>open_ranges[BB].remove(range);<br class=""> <span class="Apple-converted-space"> </span>outgoing_loc[BB][range.var].push_back(range.loc);<br class=""> <span class="Apple-converted-space"> </span>changed = true;<br class=""> }<br class="">}<br class=""><br class="">// Visit the beginning of a basic block BB with two predecessors.<br class="">join(BB, BB1, BB2) {<br class=""> <span class="Apple-converted-space"> </span>for ((var1, loc1) : outgoing_loc[BB1])<br class=""> <span class="Apple-converted-space"> </span>for (loc2 : outgoing_loc[BB2][var1])<br class=""> <span class="Apple-converted-space"> </span>// It's only safe to propagate a range if all predecessors end<br 
class=""> <span class="Apple-converted-space"> </span>// with the same location.<br class=""> <span class="Apple-converted-space"> </span>if (loc1.kind == loc2.kind &&<br class=""> <span class="Apple-converted-space"> </span>loc1.val == loc2.val)<br class=""> <span class="Apple-converted-space"> </span>// Not sure how to best implement this.<br class=""> <span class="Apple-converted-space"> </span>open_ranges[BB][var].push_back(new DBG_VALUE(loc1));<br class=""></blockquote><div class=""><br class=""></div><div class="">Yep, that's kind of problem, we can't modify MachineFunction at this point.</div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">}<br class=""><br class="">analyze(MF) {<br class=""> <span class="Apple-converted-space"> </span>for (BB : MF)<br class=""> <span class="Apple-converted-space"> </span>workset.push_back(BB);<br class=""><br class=""> <span class="Apple-converted-space"> </span>while (!workset.empty()) {<br class=""> <span class="Apple-converted-space"> </span>changed = false;<br class=""> <span class="Apple-converted-space"> </span>for (MI : workset.pop_front())<br class=""> <span class="Apple-converted-space"> </span>transfer(MI)<br class=""> <span class="Apple-converted-space"> </span>if (changed)<br class=""> <span class="Apple-converted-space"> </span>workset.append(BB.successors())<br class=""> }<br class="">}<br class=""><br class="">A couple of observations about the pseudo-code above: The analysis<br class="">terminates, because ranges is write-only, making this effectively a<br class="">bit-vector problem. 
It is also safe, because we are only propagating<br class="">ranges into the next basic block if all predecessors have the same<br class="">outgoing location for a variable piece.<br class=""><br class="">One problem that is not addressed by the approach above is how to<br class="">become better at handling variables that are in more than one location<br class="">at once: As Keno noted on llvm-commits recently, the fundamental<br class="">problem is that DbgValueHistoryCalculator cannot safely distinguish<br class="">between a DBG_VALUE that provides an additional valid location and one<br class="">describing an updated location for a variable. IMO this is best<br class="">addressed on a case-by-case basis in the pass that introduces the second<br class="">DBG_VALUE, either by marking it as an alternative location or not emitting<br class="">it at all, but I’m open to suggestions.<br class=""><br class="">Comments?<br class=""><br class="">-- adrian<br class=""><br class=""><br class="">_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:LLVMdev@cs.uiuc.edu" class="">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu/" rel="noreferrer" target="_blank" class="">http://llvm.cs.uiuc.edu</a><br class=""><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" rel="noreferrer" target="_blank" class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br class=""></blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div>--<span class="Apple-converted-space"> </span><br class=""><div class="gmail_signature"><div dir="ltr" class="">Alexey Samsonov<br class=""><a href="mailto:vonosmas@gmail.com" target="_blank" class="">vonosmas@gmail.com</a></div></div></div></div></div></div></blockquote></div><br class=""></body></html>