[Lldb-commits] [PATCH] D105741: [trace] Add `thread trace dump ctf -f <filename>` command

Wed Jul 14 16:57:56 PDT 2021

jj10306 marked 6 inline comments as done.
jj10306 added inline comments.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:3
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
----------------
clayborg wrote:
> Maybe add a definition here of what "HTR" stands for and a bit of background on what classes are in this file here or below.
Working on comprehensive HTR documentation that I plan to store in `docs/` - will update the documentation in this file to mention where the comprehensive documentation can be found.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:27
+/// \class HTRBlockMetadata TraceHTR.h "lldb/Target/TraceHTR.h"
+/// Metadata associated with an HTR block
+/// Metadata is initially populated in Layer 1 and merged as blocks are merged
----------------
clayborg wrote:
> What is HTR? A definition here or in the initial comment block as the top of the file with a URL or full explanation would be great.
Working on comprehensive HTR documentation that I plan to store in `docs/` - will update the documentation in this file to mention where the comprehensive documentation can be found.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:29
+/// Metadata is initially populated in Layer 1 and merged as blocks are merged
+class HTRBlockMetadata {
+public:
----------------
wallace wrote:
> This is a very expensive structure if you are going to have for Layer 1, which is just a bunch of instructions => m_num_instructions will always be 1 and m_func_calls will have 0 or 1 entries.
> 
> I'd suggest creating two classes: 
> 
> InstructionLayer (layer 0), where you could assume that m_num_instructions is 1 and instead of having the map m_func_calls, you can have an Optional<ConstString> for the function call, or you could also have a map (address -> ConstString) at the layer level that could use to look up function calls instead of storing this information in each instruction. If you follow the latter advice, the memory usage of Layer 0 would have small overhead, i.e. InstructionLayer = Instruction list + function map
> 
> BlockLayer (layer 1+), here you have blocks of 1 or more instructions. Assuming that the number of blocks will be small, you could store more metadata in each block and here statistics like number of function calls make more sense.
Discussed offline - decided on creating an `ILayer` that the `InstructionLayer` and `BlockLayer` will inherit from. `ILayer` will define the interfaces to interact with a layer.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:41
+  size_t m_num_instructions;
+  std::unordered_map<std::string, size_t> m_func_calls;
+  friend llvm::json::Value toJSON(const HTRBlockMetadata &metadata);
----------------
clayborg wrote:
> Use ConstString here instead of std::string. std::string will make a copy of the string. ConstString objects should always be used for storing function and type names as a constant string pool will unique any strings that are placed into a ConstString. This will avoid consuming more memory for these traces and ConstString also is much more efficient for comparing strings for equality if that is needed anywhere.
I tried using `ConstString` here, but I was running into issues because I don't believe there's a hash implementation for `ConstString` so it's unable to be the key type for an `unordered_map`. I'm considering using a `map` instead which should allow the use of `ConstString`, but at the cost of performance because lookups in these map are done quite often (every time two blocks are merged). What are your thoughts?

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:48-50
+/// Sequences of blocks are merged to create a new, single block
+/// Each block indirectly corresponds to a sequence of "unit" blocks (ie
+/// instructions)
----------------
clayborg wrote:
> Maybe some more complete documentation here would be great to fully explain what this class does and tracks.
Working on comprehensive HTR documentation that I plan to store in `docs/` - will update the documentation in this file to mention where the comprehensive documentation can be found.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:73
+/// Maps the unique Block IDs to their respective Blocks
+template <class TMetadata> class HTRBlockDefs {
+public:
----------------
wallace wrote:
> clayborg wrote:
> > Does this need to be templatized? Are we using this same class with multiple different TMetadata classes or are we always using it with HTRBlockMetadata?
> +1
> I don't think you should go with templates unless you had more TMetadata objects
Currently, the templates are not needed because we have a single metadata type, but I added the templates to make this design extensible to different use cases that may require different metadata.

Since we currently have just a single metadata type, I'll go ahead and remove the templates and if we ever need to add a new metadata type we can discuss how to extend the design.

================
Comment at: lldb/include/lldb/Target/TraceHTR.h:143-144
+  std::vector<size_t> block_id_trace;
+  template <class T> friend class TraceHTR;
+  template <class M> friend llvm::json::Value toJSON(const TraceHTR<M> &layer);
+};
----------------
clayborg wrote:
> using the "friend" keyword usually means some methods that are private should actually be public. Can you see if you can make some methods public to avoid having to "friend" other classes?
I'm going to add getters to these classes so this should allow for all the `friend`s to be removed

================
Comment at: lldb/source/Commands/CommandObjectThread.cpp:2172
+
+  CommandObjectTraceDumpChromeTraceFormat(CommandInterpreter &interpreter)
+      : CommandObjectIterateOverThreads(
----------------
clayborg wrote:
> Instead of making a new custom command, I would say we should just add a "--export" option to the "thread trace dump" command that can take a export enumeration as its value. Possible values would be "chrome-trace-format" or "ctf". If we architect it correctly, we would create a Trace export plug-in within LLDB to allow other people to create new trace export plug-ins in the future. 
I agree that the allowing exporting to different formats is useful, so I think the trace export plug-in is a great idea.

Discussed with @wallace offline and we decided for this diff we are going to add a `thread trace export` command that has a `--format` option and in a future diff we could add implement the plug-in logic.

================
Comment at: lldb/source/Target/TraceHTR.cpp:56
+  int block_id = 0;
+  std::unordered_map<lldb::addr_t, size_t> load_address_to_block_id;
+  // This flag becomes true when cursor->Next() returns false
----------------
clayborg wrote:
> So each unique address gets added to this map? if we have many instructions that are in the same block like 0x1000, 0x1004 and 0x1008, we will have all three addresses in the map pointing to the same block ID? If we used an ordered_map, we could use lower_bound to find the lowest address that points to the block, and the block could have an address range inside of it that could be updated as we went along. It might end up saving memory if we can do it this way. 
> 
> So if we have:
> ```
> 0x1000: nop
> 0x1004: nop
> 0x1008: nop
> 0x100c: call 0x2000
> 0x2000: nop 
> 0x2004: nop 
> 0x2008: nop 
> ```
> This would end up with two blocks, one whose range is [0x1000-0x1010) (4 bytes after 0x100c), and one from [0x2000-0x200c). If we used a map the map could simply end up containing:
> ```
> 0x1000 -> 1 (Block [0x1000-0x1010)
> 0x2000 -> 2 (Block [0x2000-0x200c)
> ```
> Using an ordered_map would take a bit more logic to always update a Block's address range on the fly and as load addresses come in, like 0x1004, 0x1008, 0x100c, we would need to locate the 0x1000 entry using "load_address_to_block_id.lower_bound(addr)" and seeing if the block's address range has been finalized, and if not, update it, and if it was, just make sure the address is in the block.
> 
> Where as right now the load_address_to_block_id would end up containing:
> ```
> 0x1000 -> 1
> 0x1004 -> 1
> 0x1008 -> 1
> 0x100c -> 1
> 0x2000 -> 2
> 0x2004 -> 2
> 0x2008 -> 2
> ```
> I worry about the size of the load_address_to_block_id data structure if it isn't minimized.
Correct, each unique address gets an entry in this map.

For the example you gave `load_address_to_block_id` would end up containing:
```
0x1000 -> 0
0x1004 -> 1
0x1008 -> 2
0x100c -> 3
0x2000 -> 4
0x2004 -> 5
0x2008 -> 6
```
The goal of this map is to assign each load address a new, unique id. The idea with assigning a new unique id is that this id could be smaller in size compared to a load address (ie 4 bytes versus 8 bytes) since the number of unique instructions in a trace is typically significantly smaller than the total number of instructions in a trace (<1% based on some testing I did) - this obviously would depend on the specific trace but the high level idea is there is potentially an opportunity to save space since it is very unlikely that any useful trace would have more than `2^32 - 1`  unique instructions. 

`size_t` is currently used as the type for the ids so this memory saving idea isn't implemented here, but wanted to get feedback on this idea.

For this diff, I'll go ahead and remove this map as a whole and just use the load addresses themselves as the unique id for a block, but I'd like to keep the idea I described above in mind for future work.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105741/new/

https://reviews.llvm.org/D105741