[lldb-dev] logging in lldb

Tue Dec 6 10:14:15 PST 2016

The other problem with using streams is we need each log line to come out in a single chunk. So we somehow need to specify the start and end of a log. For example if two threads are doing:

if (log)
  log->Printf("%i %i %i\n", x, y z);

And we switch over to using:
   if (log) *log << 1 << ' ' << 2 << ' ' << 3 << '\n';

We can't have that come out separately. We need to be able to have the logging know that everything has to come out at once as we don't want to see the following if two threads were logging these lines at the same time:

1  12 2 3 3\n\n

So the stream stuff all now needs to mark where the logging begins and ends:

   if (log) *log << log::start_message << 1 << ' ' << 2 << ' ' << 3 << '\n' << std::end_message;

And that starts to ruin the nice concrete log calls we had with log->Printf() since each call to log->Printf() was a complete message. I am open to ideas on how to solve this. Anyone have any ideas?

Zach, how many people are using llvm::formatv already? Will it be hard to change/evolve the format strings and contents if necessary? Might be hard if there are a lot of clients, but I am just trying to get a feel.

> On Dec 6, 2016, at 9:48 AM, Jim Ingham via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> BTW, I just want to make it clear in this discussion that logging is a really important part of lldb.  This isn't a matter of "you had some debug logging you put in to solve a problem, and you didn't want to have to take it out next time you use it."  We very often get bugs from customers who can't reduce the problem to a small test case, and can't make their code available to us.  So logging is the only way you are ever going to solve this class of problems.  Fortunately the logging in lldb is pretty good at present, and we have all often been able to solve this sort of problem on logging alone.
> 
> As you write code in lldb you should be thinking "If I had to figure out what was going on in some tricky circumstance, and all I had was this log, would I be able to figure it out."  And to this end, being able to output complex log messages is crucial.
> 
> Sorry if this is obvious to folks, but on the off chance it isn't, it's worth a few bytes to restate it.
> 
> Jim
> 
>> On Dec 6, 2016, at 9:39 AM, Jim Ingham via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>> 
>>> 
>>> On Dec 6, 2016, at 8:23 AM, Pavel Labath <labath at google.com> wrote:
>>> 
>>> Hello all,
>>> 
>>> I've been thinking about how to simplify the way we do logging in
>>> lldb. The main two problems I see are:
>>> a) inability to print complex types easily - this was always tricky,
>>> but lately has become even more problematic, as we're starting to
>>> introduce llvm::StringRef, chrono, and other non-C types everywhere
>> 
>> Streaming output tends to chunk the output text into smaller bits.  That's a problem when you are doing "I have a log line from some customer, I have no idea where it is in the code, so I'm going to search for the text in the code."  That actually happens pretty frequently.  The more you chunk up the text, the harder this task is.  This problem exists in the current method especially since the restriction to 80 chars meant that lots of contiguous log lines are now nonsensically broken up into chunks.  It would be nice not to make it worse.  This is more about ways to write log lines, however; more cultural than implementation.
>> 
>> Like Greg, I find streaming harder to read than printf, particularly if I'm trying to make things indent nicely for reading.  So again, I'd argue for adding these macros, but not trying to replace the current mechanism.
>> 
>> 
>>> b) the need to explicitly specify the log source (class and function
>>> names) explicitly - this leads to increased verboseness and
>>> inconsistency
>> 
>> I think having a convenience to add function name decorators is fine, but we need a way not to do it all the time.  We have logs that don't have function names because they log over several functions, and having the different names adds visual noise, would make the rest of the log impossible to line up nicely, and adds no important information.  You will write log statements infrequently, but you will have to read them much more frequently, so taking a little more effort to make them readable is well worth it.
>> 
>>> c) the need to write if(log) everywhere -- also increases visual noise
>> 
>> Sometimes you need to do some preliminary calculations to gather the materials you want to log, so the code in the "if (log)" is not trivial.  It has to be inside the "if (log)" however, since you only want to do logging computations if you are logging.  I'd really rather not have to write this in a macro as that makes it hard to read and harder to debug.  That doesn't argue against adding these macros, but it does argue that we need to keep the old form.
>> 
>> Aesthetically, your form does not make it clear that the logging statements are not getting done if logging is off.  I don't want somebody casually working on the code not to log something because they were unsure of this.  We could address that through documentation, a solution whose goodness is weighted by the faith you have in how many people read this sort of documentation.
>> 
>>> 
>>> I tried to not reinvent the wheel as much as possible, so I decided to
>>> model my proposal based on how gtest logs it's failure messages. It is
>>> not a super-well-known feature of gtest, but you can stream additional
>>> data into the ASSERT statement, in case you think the plain gtest log
>>> message will not provide enough context for the user. For example you
>>> can write:
>>> for (auto test: test_cases)
>>> EXPECT_TRUE(do_test(test)) << "Input was: " << test;
>>> 
>>> I think this pattern has potential to solve all the problems I
>>> outlined above. Usage of streams (llvm streams in our case) enables us
>>> to do any fancy formatting, including the use of recently introduced
>>> llvm::formatv. And the presence of the macro means that we can hide
>>> things like "if(log)" and __FUNCTION__ there. If the macro is written
>>> carefully, the whole log statement will have minimal overhead (the
>>> same as our current solution) if logging is disabled.
>>> 
>>> I have created straw-man implementation of such an interface in
>>> Dxxxxx, together with some examples of how it's used, and I'd like to
>>> hear you comments. I've deliberately made it as a custom layer added
>>> on top of the current implementation, to show how simple the thing can
>>> be -- after converting just half of a single file to the new
>>> interface, the implementation has already "payed off" in the number of
>>> LoC added/removed. If I will be doing the thing it for real, I'd like
>>> make the integration with llvm streams much deeper.
>>> 
>>> So, my question is: Do you like the design in general (if not, why
>>> not?). If you do, then I'd like to hear your thoughts about the
>>> particular details of this design:
>>> - the new implementation prints File::Function as the source, instead
>>> of the conventional Class::Function. The reason for this is that the
>>> the class name is not easily accessible from the preprocessor. I think
>>> this is fine as generally file_name = class_name+".cpp", but possible
>>> implementations I can think of are:
>>> - file+line: I didn't choose this one as lines do not say much, if
>>> you don't have the source file open. They can also change from
>>> revision to revision due to unrelated changes.
>>> - __PRETTY_FUNCTION__: It contains the class name, but can get
>>> pretty messy for complex functions. Possibly we could do some
>>> processing on this value to shorten it (?)...
>>> - Have each (loggable) class contain a static consexpr field NAME__,
>>> and then reference that from the log macro: works, but it's quite
>>> messy
>>> 
>>> - LLDB_LOG() vs. LLDB_LOG(log) -- All our log variables are named log,
>>> so maybe we don't have to reference them explicitly? I couldn't decide
>>> whether that helps or hurts readability. (I wanted to keep the log
>>> variables, instead of specifying the log flags explicitly, as that
>>> means individual log statements can be very cheap, and short.)
>> 
>> I'd rather keep this explicit.  I don't know if we do it ever, but it is totally reasonable that one function might want to funnel information to two different channels.  Also, it makes explicit the connection between the log object and the channel it is opened for.  It is a bit weird to have a seemingly unused log creation line, then a bunch of logs later on that don't look connected to it.  The more oddities you put in the code the harder it is for a new engineer to begin to grok it.
>> 
>> 
>> Jim
>> 
>> 
>>> 
>>> Note, I am not proposing we do a wholesale replacement of the whole
>>> lldb code base, as that will be a huge job -- the new implementation
>>> can live together with the old one quite happily (as it does in my
>>> example). I will certainly go and refactor code I own (linux and
>>> android), and anyone else can do the same but I am proposing a more
>>> gradual migration path, where the log statements are updated as we
>>> touch the affected functions -- I am hoping people will gravitate
>>> towards that naturally, as the new log statements will be shorter and
>>> easier to use with more complex types. After usage of the old logs
>>> drops below a certain percentage, we can go and replace the rest.
>>> 
>>> So, what do you think?
>>> pl
>> 
>> _______________________________________________
>> lldb-dev mailing list
>> lldb-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev