I know I said I'd work on logging APIs this week… but I'm still cleaning up some code before it's ready to present. So this is an interlude post.
The question today is the deceptively simple "how do I log an exception", which turns out to be interesting on multiple levels.
Canonically, the way you log an exception in SLF4J is to log a domain specific message, and include the exception as the last parameter. You can include parameters and use SFL4J templates, so having multiple arguments is fine.
However, this isn't the end of the story. When you render the exception in an appender, you have to decide how to render the stacktrace. The stacktrace is typically voluable, and will cover most of your console and your file in stack trace elements.
Typically, you don't actually care about most of the stacktrace in this case. There's nothing you can do about anything in the Xerces stack, and it's not relevant. You care about the message in the exception. This is especially relevant in appenders where real estate (or disk space) is at a minimum: for example, the exception may not be written to the console at all.
So, if you want to see the message written as:
Then you should call
e.getMessage() and append only the message.
I think this is common practice. I've only had two objections to the practice of adding
e.getMessage when logging. The first objection was that the exception could come from an unknown source, and so the runtime cost of calling
e.getMessage was unknown and could result in 20 MB of /dev/random data, if it didn't risk crashing, halting, blocking, the program altogether.
This is not a theoretical problem! I've personally seen harmless code block entire applications because they called
obj.equals only to discover that java.net.URL makes blocking network calls on hashCode and equals.
I think you have to trust that
getMessage returns a message and is not pathological – if you don't trust an exception to be safe, then the only information you can extract from it is that it exists and has a certain type: you can safely call
e.getClass().getName() on it.
The other objection was that calling
e.getMessage was duplicating information that was already in the stacktrace. I think this is a confusion: a message is domain specific diagnostic information for the user. A stacktrace consists of the message and stacktrace elements that are not domain-specific. Printing out the message is relevant even when the stack trace is not: it's a summary.
However, this does point the way to a deficiency in using
e.getMessage(): it doesn't print out all the diagnostic information available to the stacktrace. Specifically, it doesn't deal with chained exceptions, which may have several exceptions nested deep. Using
e.getMessage() only prints out the top level – this is especially annoying when a useless
java.util.concurrent.CompletionException hides the real exception underneath it.
What I'd like to see is something that runs through all the messages and sticks them together, without having to wade through stack trace elements. So if I write up something like this:
Then I should get a message like this:
We can make everyone happy by rendering the message using a custom converter. This means that the cause happens automatically, on every exception, and by deferring rendering to another thread and managing the thread pool, we can also reduce the impact of blocking operations.
The converter looks like this:
and the implementation looks like this.
getMessage, which is a strength, and it also means that rendering to console can be strictly line based, which means pasting things into Slack is that much easier.
However, this only covers
getMessage. This doesn't cover anything else. If the exception contains vital information but doesn't put it in
getMessage, then you're essentially blind to it. This is especially annoying when it comes to the AWS Java API, because AWS exceptions are extremely rich in useful domain information –
serviceException.requestId() – but rely on the caller to know to extract and present this information.
So, if you want to render an AWS exception then you still have to call
kv structured arguments to expose that information in a structured format.
This leads to an interesting set of problems. We want to extract relevant information from the exception and this is only available per exception type. This can be demonstrated with
This gives us a better result:
This isn't a complete solution, because it has the same problem as using
e.getMessage() in that it still only renders the attributes of the top level exception. If there's an
SQLException halfway through the stack, you won't see the error code or the SQL state. In order to do this properly, you'll have to run the exception mapper through the complete chain and potentially namespace the key value pairs into the relevant exception context if the keys conflict.
Also, the multiple catch blocks for
SQLException are awkward. They are not disjoints, so you can't use a union, and they must be defined in order. I'd much rather do this with Scala style pattern matching, or at least a
Map<Class, ExceptionMapper> that could find the most specific exception.
Finally, this is a manual process. It has to be added as code to every logging statement, and can't be added to third party code.
The correct place to put this would be a custom converter, but I don't feel like getting into it now, and the tricky bit would be populating the mapping, which is domain specific. Still, I think this points out where exception logging could be made more useful.
One thing that popped up in this is that
logstash-logback-encoder doesn't provide a way to render an exception as structured data. This means that it's not so easy to parse and extract message information from stacktraces. Theres no
jq style slicing and dicing of JSON logs. At best, you have sed, awk, and regex.
Logback steno renders stacktraces as a multiline string inside of a JSON field, which I'm not so concerned about. Most of the tools that allow you to do something with stacktraces, i.e. the many companies devoted to error tracking and the IDEs like IntelliJ IDEA, all work by parsing and recognizing the stacktrace in the multiline text format. The log4j2-logstash-layout library does render stacktrace information as structured data, but I don't know of anything that would consume it. As such, I don't think there's any advantage in representing a stack trace in JSON.