I've written a logback appender that uses the Honeycomb Event API. It's pretty great!
When To Use It
You should use Honeycomb for observability, i.e. the "unknown unknowns." The advantages of using Honeycomb is that it gives you an easy way to throw your logs at a wall and see what looks interesting. The entire team can look at the same logs, pick things out, and see what's going on.
In addition, with Honeycomb you can trace your operations very easily: this gives you a way to see your code flow, and also do some lightweight profiling to find which bits of your code take the most time.
When Not To Use It
I don't recommend using this as your primary means of logging in production: I think writing JSON to Flume/Kinesis Firehose/Aggregator and then sending aggregated data through a lambda is more scalable approach.
When you're debugging a problem on your laptop and you already know the details of what's going on, then Honeycomb is less useful. If you're looking at debug logs from an underlying library, then it's less useful. You can see the raw logs as a file just as easy as in Honeycomb, and the actual data is minimal. Honeycomb excels when you have structured data that you can sort and filter on, and unfortunately the underlying library debug output tends to come in free-form text format, so you're still generally better off using ripgrep or dumping JSON into PostgreSQL and investigating it there.
There is another drawback to using the Event API, which is that you can't do dynamic sampling the same way that you would do with honeytail. I think this is okay, given that you can implement your own dynamic sampling according to your own needs, and use a logback filter to manage it for you.
How To Add It
Tracing with Honeycomb
One of the more interesting bits of the honeycomb API is that you can do tracing just by adding a few extra JSON fields to your logs. All you need is
name, and you'll be able to see data in the Honeycomb span view.
The way you do this in Logback is through
com.tersesystems.logback.honeycomb.SpanInfo.Builder, which puts these together, and then convert through the
SpanMarkerFactory, which turns them into an aggregated LogstashMarker:
At the end of it, you get this:
This tells you exactly how long your code took, and it shows you all of the information in a tracing based hierarchical format and in a straight flat file / JSON format, with few lines of code.
Why This Is Better Than Opentracing / OpenCensus
This is something you can't get from the Opentracing API – you can send opentracing information to SLF4J, but you're still using the OpenTracing API to do it. The Opentracing API requires you to activate spans, pass them between threads, render error messages using semantic conventions. This means writing things like:
I'll admit I'm biased, but this just looks fiddly and easy to get wrong. I'd expect something like
span.error(message, e) and the details to be done under the hood. Yes, you don't have to pass the scope around between methods because it's in a
ThreadLocal, but you can just as easily stick
SpanInfo in a
ThreadLocal, and you'll have to pass span information explicitly in any case when you start using asynchronous
CompletableStage style APIs.
Opentracing also has larger performance requirements than Honeycomb. To see tracing information, you'll need something like Jaeger All-in-One Docker Image, and then you'll run Jaeger agents in Kubernetes sidecars. It's a decent solution for distributed microservices in production – for single-app development it's like using a mallet to squash a fly.
Next steps, I'm going to add span information to logback-bytebuddy so I can instrument that code and have it show up in Honeycomb from config. I've already got configuration driven instrumentation, so it's probably just about keeping a stack of spans around and then you'll be able to see Java innards in a trace.
I'm still working on the series of logging API blog posts, but you can start seeing the bones of it already in the
SpanInfo code already: most of it is about marshalling data before turning it into markers and sending it through SLF4J under the hood. For the
SpanInfo code, just using Google AutoValue and Google AutoService takes care of the boilerplate work.