[llvm] [symbolizer] Empty string is not an error (PR #92660)

Fri Jun 7 12:48:48 PDT 2024

spavloff wrote:

> > > A simpler proposal: `ECHO` is only supported when not passing `--obj`, in which case it is unambiguous.
> > 
> > Is there a reason why users, who don't pass `--obj`, do not need this facility?
> 
> As I understand it, the functionality is only needed by a tool to detect if llvm-symbolizer has entered interactive mode and is ready to receive input. I believe this is only possible when `--obj` is specified.

If `--obj` is not specified, `llvm-symbolizer` will try to read the binary name from the input, together with an address. This is a normal use case, and users may want to check the tool's liveness there as well. IIRC this is the mode used in compiler-rt tests.

> > > To be treated as a command, the argument must have at least one additional space after the ECHO
> > 
> > It would be a case of secret symbol, invisible but significant.
> 
> I don't think it would be any more than other instances of options that take an empty string as the value. It's certainly not a "secret symbol", given that we'd document such a behavior.

Imagine a user who tries to diagnose a problem using logs. Visually identical commands produce different results. Yes, documenting this behavior should help, but who reads documentation? If possible, simple actions should be implemented with simple, "obvious" commands that do not require studying the documentation.

An empty string is not the best solution. It is also invisible, but at least it is obvious: users expect some harmless reaction to pressing Enter.

> What do we do about `DATA`, `CODE` etc? They'll have just the same set of ambiguity issues as `ECHO`, right?
>

All these commands require an argument, separated by one or more spaces. Such input cannot be a symbol name; at least, `addr2line` drops everything starting from a space. If `CODE` is used without arguments, it is treated as a symbol name, not a command.

`ECHO` does not take arguments, and this causes the ambiguity.
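To illustrate the rule, here is a rough sketch in Python. It is not the `llvm-symbolizer` parser; `classify`, `PREFIXES`, and the `"PLAIN"` tag are hypothetical names, and treating a prefix followed only by spaces as plain input is an assumption made for the sketch.

```python
# Hypothetical sketch of the rule described above; not llvm-symbolizer code.
PREFIXES = ("CODE", "DATA")


def classify(line: str):
    """Return (kind, payload) for one input line.

    A prefix counts as a command only when it is followed by at least one
    space and a non-empty argument. Anything else is passed through as
    plain input (a symbol name or an address).
    """
    for prefix in PREFIXES:
        if line.startswith(prefix + " "):
            argument = line[len(prefix):].strip()
            if argument:
                return (prefix, argument)
    return ("PLAIN", line)


print(classify("CODE 0x1000"))  # command with an argument
print(classify("CODE"))         # bare word: indistinguishable from a symbol name
print(classify("ECHO"))         # an argument-less ECHO would look the same
```

A bare `ECHO` has nothing that distinguishes it from a symbol called `ECHO`, which is the ambiguity in question.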

> To be clear, as I understand it adapting to the `??:0` reply is not the issue: adapting for the stderr printout is the issue - this is more significant than adding a little extra to the input script.

Exactly!

> Just to be clear, my issue with this particular solution is that it special cases an empty string argument, which I'd prefer to avoid. I'm not strongly opposed to it, but would prefer an alternative approach if we can find one that has consensus.

Agreed. If we could invent something suitable instead of the empty string, it could be a solution. The only possibility that comes to mind is `ECHO` with a mandatory argument. It does not seem obvious and may be a source of trouble for users if they omit the argument, because in this case `llvm-symbolizer` would silently treat it as a symbol name.

> Incompatibility with addr2line is a non issue: there are several other ways in which llvm-symbolizer consumes input/produces output that is not compatible with addr2line. Specifically printing/echoing an empty string (or more correctly, a blank line, i.e. `'\n'` not `''`) is also not a goal, as I understand @pcc's comments, as long as something is printed.

Yes, `llvm-symbolizer` does not need to follow `addr2line` exactly, and it does not. That is not a problem. But echoing an empty line looks uncomfortable: a user issues an invisible command and gets an invisible response. Something visible would be better.

Maybe we could consider this functionality from two viewpoints. First, the reaction to a blank line: it cannot represent a symbol, so printing an error message on stderr is probably not right. Second, a way to check whether `llvm-symbolizer` is working: a special command `ECHO message` could be introduced that prints the specified message. What about such a solution?
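Roughly, the two parts of this proposal could be separated as in the Python sketch below. This only illustrates the idea and is not a patch; `dispatch` and the result tags are hypothetical, and what exactly is printed for a blank line is deliberately left open.

```python
# Hypothetical sketch of the proposed input handling; not llvm-symbolizer code.
def dispatch(line: str):
    """Classify one input line as ("blank" | "echo" | "lookup", payload)."""
    text = line.rstrip("\r\n")
    if not text.strip():
        # A blank line cannot name a symbol, so it should not end up as an
        # error message on stderr.
        return ("blank", "")
    if text.startswith("ECHO "):
        message = text[len("ECHO "):].strip()
        if message:
            # Liveness check: the tool prints the message back.
            return ("echo", message)
    # Everything else, including a bare "ECHO", goes through normal
    # symbolization.
    return ("lookup", text)


print(dispatch("\n"))
print(dispatch("ECHO ready\n"))
print(dispatch("ECHO\n"))
```

With a mandatory argument, `ECHO ready` is visible both in the input script and in the output, while a bare `ECHO` still falls through to the symbol lookup, as noted above.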

https://github.com/llvm/llvm-project/pull/92660