Blindsight vs Echopraxia

A very short blog post, written while I'm noodling on a new release of Blindsight.

I've said before that Blindsight "supports" structured logging. Echopraxia "requires" structured logging. It occurs to me that this is really a bit backwards.

Structured logging typically talks about the output of logging: mapping whatever data you have in the logging event into JSON. But this doesn't talk about the inputs – how you provide the JSON with something to chew on. More accurately, we should call most structured logging "structured output" because the output is structured even when most of the input isn't.

So what is a structured input? A structured input is a reliable key/value pair, where the key is typically a string.

For a long time, MDC was the only way to reliably establish a key/value pair in SLF4J, but it wasn't complete because it could only take a string as the value. Then logstash-logback-encoder added event-specific custom fields, but the value is still a plain java.lang.Object and there is no consistent structure. For example, you can't specify a StructuredArgument as the value of another StructuredArgument, and building up a complex semi-structured object is not possible.
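To make "consistent structure" a bit more concrete, here is a minimal sketch in plain Scala of what a structured input model could look like. The names (`Value`, `Field`, `toJson`, and so on) are hypothetical and are not the API of Blindsight, Echopraxia, or logstash-logback-encoder. The point is only that a field's value is itself structured, so fields can nest, and the JSON escaping is deliberately simplified.

```scala
// Hypothetical model of structured input; not the API of any real library.
object StructuredInputSketch {
  // Every value is one of a closed set of known shapes.
  sealed trait Value
  final case class StringValue(s: String) extends Value
  final case class NumberValue(n: BigDecimal) extends Value
  final case class BooleanValue(b: Boolean) extends Value
  final case class ArrayValue(elements: List[Value]) extends Value
  final case class ObjectValue(fields: List[Field]) extends Value

  // A field is a string key plus a structured value, so fields nest to any depth.
  final case class Field(name: String, value: Value)

  // Because every node has a known type, rendering handles every case.
  // Escaping only covers backslashes and double quotes (a simplification).
  def toJson(value: Value): String = value match {
    case StringValue(s)  => "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
    case NumberValue(n)  => n.toString
    case BooleanValue(b) => b.toString
    case ArrayValue(es)  => es.map(toJson).mkString("[", ",", "]")
    case ObjectValue(fs) =>
      fs.map(f => toJson(StringValue(f.name)) + ":" + toJson(f.value)).mkString("{", ",", "}")
  }

  // A nested field: an object containing a string, a number, and an array.
  val person: Field = Field("person", ObjectValue(List(
    Field("name", StringValue("Abe")),
    Field("age", NumberValue(BigDecimal(42))),
    Field("tags", ArrayValue(List(StringValue("a"), StringValue("b"))))
  )))
}
```

With a java.lang.Object value there is no equivalent of the exhaustive match in `toJson`: the encoder has to guess at what it was given.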

Blindsight gives you the option of providing structured input using an Argument built with the DSL, and that input does have a consistent structure. But Blindsight doesn't require that of you. You can mix and match structured and unstructured input, and it's fine:

import java.time.Instant
import com.tersesystems.blindsight.DSL._
logger.info("unstructured = {} structured = {}", "string", bobj("instant" -> Instant.now))

Echopraxia requires all input to have structure: input is converted into Field instances through a FieldBuilder, and instead of variadic arguments, there's a FieldBuilder => FieldBuilderResult function. For the Scala API, it looks like this:

logger.debug("{}", _.keyValue("foo" -> "bar"))
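As a rough illustration of why a function parameter enforces structure, here is a self-contained sketch in plain Scala. It is not Echopraxia's implementation: `Field`, `FieldBuilder`, and `Logger` below are simplified stand-ins, the builder only accepts string values, and the `key=value` rendering is this sketch's own choice. The idea is that the only way to hand data to the logger is through the builder, so there is no slot for an arbitrary object.

```scala
// Simplified stand-ins for illustration only; not Echopraxia's real classes.
object FieldBuilderSketch {
  final case class Field(name: String, value: String)

  // The builder is the only way to construct loggable input.
  class FieldBuilder {
    def keyValue(kv: (String, String)): Field = Field(kv._1, kv._2)
  }

  class Logger {
    private val builder = new FieldBuilder

    // No varargs of Any: the caller must supply a function from builder to field.
    // Returns the rendered line so the behaviour is easy to inspect.
    def debug(message: String, f: FieldBuilder => Field): String = {
      val field = f(builder)
      message.replace("{}", field.name + "=" + field.value)
    }
  }

  val logger = new Logger
  val line: String = logger.debug("{}", _.keyValue("foo" -> "bar"))
}
```

In this sketch, something like `logger.debug("{}", "some string")` does not compile, which is the whole point of the stricter signature.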

So why require structured input?

The big answer is that structured input is valuable for developing in the large. By and large, structured formats are the norm in any kind of service: Protobuf, Avro, Parquet, HTTP parameters, and so on. Being able to carry structure over and through into logging adds coherence and allows logging-specific serialization of complex objects.

The more detailed answer is that once you can rely on structured input, you can query and filter your log events vastly more effectively. You can also choose how to render fields in the event, not just in JSON but also for line-oriented encoders. For example, you can say %fields{$.request_id} and render only the request_id value using a custom converter with a pattern encoder in Logback:

<configuration>
    <conversionRule conversionWord="fields" converterClass="com.tersesystems.echopraxia.logstash.FieldConverter"/>    
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>
                %-4relative [%thread] %-5level [%fields{$.request_id}] %logger - %msg%n
            </pattern>
        </encoder>
    </appender>

    <root level="DEBUG">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>

Can you do this with unstructured input, or with a mix of structured and unstructured input? Sort of. Imagine your input is a list of arbitrary Object instances, with the only real guarantee being that toString will return a String. If you are given an object that contains an array, how do you query and filter on that component? You have to explicitly cast to the type, and then query on it. This is very difficult to do from inside a Logback filter, which is usually kept apart from the domain classes.
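The difference can be sketched in plain Scala. Everything here is hypothetical (the `Value` types, the `Order` domain class, and the helper names are made up for illustration and are not taken from Logback, Blindsight, or Echopraxia), but it shows the shape of the problem: the unstructured filter has to know about `Order`, while the structured one only walks keys.

```scala
// Hypothetical types for illustration; not from any real logging library.
object QuerySketch {
  sealed trait Value
  final case class Str(s: String) extends Value
  final case class Arr(elements: List[Value]) extends Value
  final case class Obj(fields: Map[String, Value]) extends Value

  // Structured: follow a path of keys without knowing any domain class.
  def lookup(root: Value, path: List[String]): Option[Value] = path match {
    case Nil => Some(root)
    case key :: rest =>
      root match {
        case Obj(fields) => fields.get(key).flatMap(lookup(_, rest))
        case _           => None
      }
  }

  def hasItemStructured(event: Value, item: String): Boolean =
    lookup(event, List("order", "items")) match {
      case Some(Arr(elements)) => elements.contains(Str(item))
      case _                   => false
    }

  // A made-up domain class that the unstructured filter is forced to depend on.
  final case class Order(id: String, items: Array[String])

  // Unstructured: arguments are opaque unless the filter casts to a known type.
  def hasItemUnstructured(args: Seq[Any], item: String): Boolean =
    args.exists {
      case o: Order => o.items.contains(item)
      case _        => false // anything else offers little beyond toString
    }
}
```

The structured version can live in logging configuration or a generic filter, because it never imports `Order`.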

So my argument is that it's a trade-off. Blindsight takes a permissive approach: it requires type safety, but does not require structure. Echopraxia takes a stricter approach and requires both type safety and structure, in exchange for more control over output.