[lldb-dev] logging in lldb

Pavel Labath via lldb-dev lldb-dev at lists.llvm.org
Fri Dec 9 04:19:07 PST 2016


There were additional comments on the lldb-commits thread. I am moving
them here and responding:

Jason Molenda wrote on the lldb-commits thread:
> A couple of thoughts / two cents.
> I don't mind the current if (log) { ... } style of logging, even with the PRIx64's and having to do filepath.GetPath().c_str() and all that. I like being able to do extra work in a if (log) block of code -- create a SymbolContext for something or whatever, and as I'm debugging that section of code I like to be able to put a breakpoint there and look more closely. If this is all in a preprocessor macro, a breakpoint is going to stop me on the if() and then I'm trying to step through a macro? And if my computation is down in a lambda I need to figure out how to put a breakpoint in that or step into that somehow to look at the values being computed. How do I create a SymbolContext my_symctx(...); and then include that in what I log if I have to do all of this in the formatv arguments?

I am not proposing we completely eliminate the ability to do have
blocks of code that run only when logging is enabled. I recognize that
there are situations where that is necessary. However, 99% of our
current logging code does not do that, and I am trying to reduce the
footprint of that. So, I don't think you have to worry about this use
case.

> I'm not thrilled with the formatv reinvention of format specification. The printf formatters are a bizarre little invention, but it's a bizarre invention that we've all worked with for decades and everyone who works in C knows well. The formatv format specification may be superior, but I don't want to see innovation here, I want clarity with the linga franca that every C programmer knows. I think the barrier to adopting something non-standard is very high.
> The formatv format specification string seems to explicitly mention the ordering of the arguments, like llvm::formatv("{0} {1} {2}", first, second, third). I'm guessing this allows you to print the arguments in different order than they appear? Like llvm::formatv("{0} {2} {2} {1}", first, second, third)? What's the benefit of this vrs. the uncertainty of which arguments are used in different parts of the string (short of counting them out by hand)? If I have a formatv format specification like "{0} {1} {2} {3} {4}" and I want to insert an argument between 2 and 3, I need to renumber 3 and 4 now? Or do I do "{0} {1} {2} {5} {3} {4}" and throw my argument on to the end?
> Simply *ability* to put the arguments in any order is my concern. Of course I'm never going to do it, but I'm going to have to work on code where other people have done it.

I think being able to refer to arguments directly i'a useful feature
to have in some circumstances (printf also allows it, although with a
syntax I always have to look up when I need it), but I agree it should
be used sparingly. I don't think anyone will put the arguments out of
order just for the fun of it. We can make that an official policy
somewhere, if you're worried about it. I guess one of the reasons for
the simplicity of the design was to only support the direct-reference
versions (I admit I originally also though I could just write {}, and
get the arguments in order).

FWIW, if I had a printf statement with more than five arguments, I'd
also have to count them by hand to make sure I got them matched right,
especially if PRIx64's get involved.

> I don't personally want to see the file / function / line numbers in my logging. It's fine if that's an option that other people can enable if they want it. There's going to be a growth in the size of our text for the constant strings for every log point having its filename and function, but it will probably not be significant (maybe the strings can be coalesced, I wouldn't bet one way or the other.)

I am not sure what what be a good way to try this out, but I wouldn't
be surprised if the overall size effect of this was a reduction in
binary size -- the thing is, most of our logging statements contain
this information already, but it's a part of the format string, so it
definitely cannot be coalesced. If they were referenced as __FILE__,
any sane compiler/linker will merge those references into a single
string constant.

I am not sure which part of the code base you are working with mostly,
but in my experience most code adds file/function information already
(RegisterContextLLDB seems to be the exception), and you have no way
of turning it off. Currently we are going toward making that optional,
so this would be an improvement in that regard as well.


> If formatv could take something a LOT closer to printf, I'd be better with it. Something like the python string formatters would be a real improvement - just %d for base 10 numbers, not having to worry about the size of the argument in the format string, for instance, and would still be readable & understandable by any C programmer out there.
Funnily enough, the inspiration for the syntax  actually did come from
python, although not the % operator, but format() function (which is
also the "new" way to do formatting in python). It was chosen because
it is more powerful. I personally wouldn't go too overboard like some
of the proposals floated around here, but I think in the common cases,
the syntax is simple enough -- instead of %d you write {0}. If you
happen to want hex, you write {0:x}. And the ability to format a
StringRef without jumping through massive hoops is a killer feature
for me.

> If we're going to start using this alternate logging scheme, eventually there will be a push to use it everywhere, or someone will go through all the other classes in the code base and switch it out without consultation. We need to all sign on to this idea before it goes into the code base at all.

This is why we are discussing it here. Does my response alleviate your concerns?


Zachary Turner wrote on the lldb-commits thread:
> Personally I'm not really a fan of source/line information at all.  There is only a very small probability that anyone using a released LLDB (e.g. through Xcode) is going to have their log lines match up with ToT, because we're changing files all the time.  You change one thing on line 10, and now every other log line that comes after it, potentially tens of thousands of lines in the same file, are going to have their line information out of sync.

This is the reason I propose to not embed (and assume the reason that
the current log lines don't do it either) *line* information into the
logs. However, file+function are changing much less frequently, and
they are helpful, as often you just want to say "I've reached this
place, with these variable values". This way, you don't have to write
the first part out explicitly. Even if the function name changed
during a refactor, it's still likely you will be able to find it under
a similar name.

cheers,
pl


More information about the lldb-dev mailing list