[PATCH] D140373: [WebAssembly][LiveDebugValues] Handle target index defs

Sat Dec 24 06:12:18 PST 2022

jmorse added a comment.

Thanks for assurance about the stack-popping being addressed elsewhere -- it might be worth putting a comment in D138943 <https://reviews.llvm.org/D138943> next to the definition of struct WasmLoc about that being a deliberate non-objective, to enlighten future readers.

The LiveDebugValues bits look fine -- it's quite a small diff,

> And another thing, I didn't add anything to copy handling because Wasm doesn't have copy instructions. (This COPY instruction is a pseudo instruction and should have been lowered into other instructions at this phase.)

If there's no movement of values with COPYs / spills / restores in webassembly then that's great,  no extra logic is needed. Having to follow multiple machine locations around is an artefact of the limited space available on physical machines.

A note on the performance characteristics -- the set of "live" variable locations that are being tracked are kept in a sparse bitvector, one bit per {Variable,MachineLocation} pair. Some large projects can have hundreds (or thousands) of variables for each machine location, which creates a lot of overhead when updating the sparse bitvector. So, a couple of years ago the sparse bitvector became a coalescing bitvector that groups bits together, and the numerical space of the locations were grouped by machine location to make them coalesce better, improving compile-time performance. There's more explanation at D74986 <https://reviews.llvm.org/D74986> and that stack.

As you're placing all target index locations in the `kWasmLocation` location, for large projects there's a risk that the tracking of variable locations won't compress very well, because the bit numbers given for them are randomly spread between variables and target indexes. If you gave each target index it's own u32_location_t, they would group together better and compress more.

This isn't actually anything to care about unless you're dealing with absolutely massive functions with lots of inlining, but I figured I'd explain in case you see poor compile-time performance in this area in the future.

> When there's a register def, this pass doesn't insert DBG_VALUE $noreg right after it to terminate its impact; this just deletes its info from OpenRanges so that it doesn't get propagated to its OutLocs. I did the same for our Wasm target indices, because it looks that's the way this pass works. But do you know why this pass was designed that way? I think adding DBG_VALUE $noreg right after the new def would make debug info more precise within the BB.

That's correct -- as it stands, there's a certain amount of redundancy because the `DbgEntityHistoryCalculator`class has to re-compute that information, even though LiveDebugValues already had to do it. I don't really know how things ended up being this way, my best guess is that other parts of LLVM used to propagate variable locations between blocks and did so incorrectly, and `DbgEntityHistoryCalculator` was needed to find out where variable ranges ended. I think LiveDebugValues was added in 2015, to more accurately propagate variable locations, but `DbgEntityHistoryCalculator` wasn't removed or re-designed as it was still doing useful things (i.e., building location lists), and there was no reason to change it. So, I think the redundancy is just a case of technical debt.

A more ideal design would be one where LiveDebugValues was an analysis rather than a pass, and it directly produced variable location ranges instead of DBG_VALUE instructions. In functions with lots of inlining and lots of blocks, LiveDebugValues effectively produces the cross product of all locations for all variables in all blocks in DBG_VALUEs, which can be a massive number of instructions, and is really inefficient! I tried a while back to convert it to an analysis, but doing that changed the line tables for some reason that I couldn't pin down X_X...

================
Comment at: llvm/include/llvm/CodeGen/TargetInstrInfo.h:2064-2065

+  /// Returns true if the given \p MI defines a target index operand. If so,
+  /// sets \p Index and \p Offset of the target index operand.
+  virtual bool isTargetIndexDef(const MachineInstr &MI, int &Index,
----------------
Nit: could I suggest writing "target index" as "TargetIndex" to make it clear it's a noun, and an operand kind. That makes it very clear to readers that it's a specific data structure rather than some higher level concept applicable to all targets.

================
Comment at: llvm/test/DebugInfo/WebAssembly/live-debug-values.mir:199
+# index 3 is killed in bb.2 because there is 'local.set 3' there. As a result,
+# bb.3 shoudln't have any additional DBG_VALUEs added for those locals.
+# CHECK-LABEL: name: test_target_index_defs
----------------

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140373/new/

https://reviews.llvm.org/D140373