@price gave me some feedback on Slack#dm-jointcal about how much information jointcal currently outputs at the INFO level. @jbosch and @timj both commented that there has been ongoing SQuaRE work to explore more sophisticated log analysis tools (e.g. logstash), but we currently have no guidance on how to extract information from logs, nor on how much information to put at each level. This creates a tension between those of us who follow the Unix “only print warnings and errors by default” approach and those who prefer to “print information that could be useful for later error reconstruction or allow others to explore properties of a run without re-running it themselves”. The latter approach requires some judgement about how useful a given output is relative to how often it appears.
If we had more intelligent logging tools, we could let our logs be more verbose without things getting lost in the “clutter”; what counts as clutter depends on context. The lack of such tools has resulted in some back-and-forth on various tickets, with different people claiming a given output should be anywhere from DEBUG to WARN, depending on how they use that code and where they land on the “verbosity spectrum”.
Do we have any writeups about those log analysis explorations, or any further plans for how to manage logs? This could help people try out different approaches to log analysis and give us a sense of just how verbose is “too verbose”.
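As a starting point for that kind of experiment, here is a minimal sketch that counts log records per logger and level, to give a rough picture of where the volume comes from. It uses only the Python standard library. The line format (`<logger> <LEVEL>: <message>`), the logger names, and the sample messages are all made up for illustration; a real log would need its own pattern.

```python
import re
from collections import Counter

# Hypothetical line format: "<logger name> <LEVEL>: <message>".
# Adjust the pattern to whatever layout the real logs use.
LINE_RE = re.compile(
    r"^(?P<logger>\S+) (?P<level>DEBUG|INFO|WARN|ERROR|FATAL): (?P<msg>.*)$"
)


def count_by_level(lines):
    """Count log records per (logger, level); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            counts[(match["logger"], match["level"])] += 1
    return counts


# Made-up sample data, not real jointcal output.
sample = [
    "jointcal INFO: Fitting astrometry",
    "jointcal INFO: Fit converged",
    "jointcal WARN: High chi2 after fit",
    "jointcal.Associations DEBUG: Matched 1234 sources",
    "not a log line",
]

counts = count_by_level(sample)
for (logger, level), n in sorted(counts.items()):
    print(f"{logger:25s} {level:5s} {n}")
```

Running something like this over the logs from a few representative runs would at least tell us which loggers dominate the INFO output before we argue about individual messages.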
Related tickets
This is not directly related to the accepted-but-not-implemented RFC-245: that would add an additional log level but we would still have to make judgement calls about whether certain information goes into DEBUG vs. VERBOSE vs. INFO.
This is somewhat related to DM-25942, in that we don’t really have guidance on how to log different types of error conditions. The dev guide has a sentence per log level but that’s open to a lot of interpretation.
There was a similar thread on Community two years ago (Warnings or other not-errors in the Stack), which led to some discussion about logging policy. In that thread, @ktl said:
“I’d say the current de facto policy is to use log.warn and then rely on the user noticing, given that warning-level log messages are supposed to be prominent in the output, or rely on a workflow system explicitly scanning the log for a warning if that were to be needed.”
Warnings regularly go unnoticed (and in many cases they may not be real problems, but the software can’t always know that a priori), and we have no workflow to scan logs.
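To make "a workflow that scans logs" concrete, here is a minimal sketch of what the smallest version might look like: pull out everything at WARN or above and collapse repeated messages that differ only in their numbers, so a warning emitted once per CCD shows up as one line with a count. This is not an existing tool. The line format (`<logger> <LEVEL>: <message>`) and the sample messages are assumptions for illustration only.

```python
import re
from collections import Counter

# Hypothetical line format: "<logger name> <LEVEL>: <message>".
WARN_RE = re.compile(
    r"^(?P<logger>\S+) (?P<level>WARN|ERROR|FATAL): (?P<msg>.*)$"
)
# Integers and simple decimals are replaced so near-identical messages group together.
NUMBER_RE = re.compile(r"\d+(\.\d+)?")


def summarize_warnings(lines):
    """Group WARN-or-worse messages by (level, logger, message with numbers masked)."""
    summary = Counter()
    for line in lines:
        match = WARN_RE.match(line.rstrip("\n"))
        if match:
            template = NUMBER_RE.sub("#", match["msg"])
            summary[(match["level"], match["logger"], template)] += 1
    return summary


# Made-up sample data, not real jointcal output.
sample = [
    "jointcal INFO: Fit converged",
    "jointcal WARN: ccd 12 has only 3 reference stars",
    "jointcal WARN: ccd 47 has only 1 reference stars",
    "jointcal ERROR: Fit diverged",
]

summary = summarize_warnings(sample)
for (level, logger, template), n in summary.most_common():
    print(f"{n:4d}  {level:5s} {logger}: {template}")
```

Even something this crude would turn "rely on the user noticing" into a short report at the end of a run, and it would show how often a given warning fires, which is part of judging whether it belongs at WARN at all.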