Terse Systems

Testing Hostname Verification


This is part of a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. I recommend The Most Dangerous Code in the World for more background. And thanks to Jon for the shoutout in Techcrunch.

Previous posts are:

The last talked about implementing hostname verification, which was a particular concern in TMDCitW. This post shows how you can test that your TLS client implements hostname verification correctly, by staging an attack. We’re going to use dnschef, a DNS proxy server, to confuse the client into talking to the wrong server.

To keep things simple, I’m going to assume you’re on Mac OS X Mavericks at this point. (If you’re on Linux, this is old hat. If you’re on Windows, it’s probably easier to use a VM like Virtualbox to set up a Linux environment.)

The first step to installing dnschef is to install a decent Python. The Python Guide suggests Homebrew, and Homebrew requires Xcode be installed, so let's start there.

Install Xcode

Install Xcode from the App Store and also install the command line tools:

xcode-select --install

Install Homebrew itself:

ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"

Homebrew has some notes about Python, so we set up the command line environment:

export ARCHFLAGS="-arch x86_64"
export PATH=/usr/local/bin:/usr/local/sbin:~/bin:$PATH

Now (if you already have Homebrew installed):

brew update
brew install openssl
brew install python --with-brewed-openssl --framework

You should see:

$ python --version
Python 2.7.6
$ which python
/usr/local/bin/python

If you run into trouble, then brew doctor or brew link --overwrite python should sort things out.

Now upgrade the various package tools for Python:

pip install --upgrade setuptools
pip install --upgrade pip

Now that we’ve got Python installed, we can install dnschef:

wget https://thesprawl.org/media/projects/dnschef-0.2.1.tar.gz
tar xvzf dnschef-0.2.1.tar.gz
cd dnschef-0.2.1

Then, we need to use dnschef as a nameserver. An attacker would use rogue DHCP or ARP spoofing to fool your computer into accepting this, but we can just add it directly:

OS X – Open System Preferences and click on the Network icon.

Select the active interface and fill in the DNS Server field. If you are using Airport then you will have to click on Advanced… button and edit DNS servers from there.

Don’t forget to click “Apply” after making the changes!

Now, we’re going to use DNS to redirect https://www.howsmyssl.com to https://playframework.com.

$ host playframework.com
playframework.com has address 54.243.50.169

We need to specify the IP address 54.243.50.169 as the fakeip argument.

$ sudo /usr/local/bin/python ./dnschef.py --fakedomains www.howsmyssl.com --fakeip 54.243.50.169
          _                _          __
         | | version 0.2  | |        / _|
       __| |_ __  ___  ___| |__   ___| |_
      / _` | '_ \/ __|/ __| '_ \ / _ \  _|
     | (_| | | | \__ \ (__| | | |  __/ |
      \__,_|_| |_|___/\___|_| |_|\___|_|
                   iphelix@thesprawl.org

[*] DNSChef started on interface: 127.0.0.1
[*] Using the following nameservers: 8.8.8.8
[*] Cooking A replies to point to 54.243.50.169 matching: www.howsmyssl.com
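With dnschef running, you can also confirm what the JVM's own resolver now returns for the spoofed name. This is a quick sketch (note that the JVM caches successful lookups, so run it in a fresh process):

```scala
import java.net.InetAddress

// Ask the JVM resolver for the address of a hostname. With dnschef in
// place, www.howsmyssl.com should come back as the fake IP, 54.243.50.169.
def resolve(host: String): String =
  InetAddress.getByName(host).getHostAddress
```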

Now that we’ve got dnschef working as a proxy, we can see whether various TLS clients notice that www.howsmyssl.com has started returning an X.509 certificate that says it came from “playframework.com”:

$ curl https://www.howsmyssl.com/
curl: (60) SSL certificate problem: Invalid certificate chain
More details here: http://curl.haxx.se/docs/sslcerts.html

Curl is not fooled. It knows the subjectAltName.dnsName is different.

Let’s try Play WS:

[ssltest] $ testOnly HowsMySSLSpec
[info] Compiling 1 Scala source to /Users/wsargent/work/ssltest/target/scala-2.10/test-classes...
Mar 31, 2014 6:11:08 PM org.jboss.netty.channel.DefaultChannelFuture
WARNING: An exception was thrown by ChannelFutureListener.
java.net.ConnectException: HostnameVerifier exception.
  at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:81)
  at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
  at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:413)
  at org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:362)
  at org.jboss.netty.handler.ssl.SslHandler.setHandshakeSuccess(SslHandler.java:1383)
  at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1252)
  at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:913)
  at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
  at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
  at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
  at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
  at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
  at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
  at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
  at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:744)

[info] HowsMySSLSpec
[info]
[info] WS should
[info] + NOT be fooled by dnschef
[info]
[info] Total for specification HowsMySSLSpec
[info] Finished in 21 seconds, 162 ms
[info] 1 example, 0 failure, 0 error
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 25 s, completed Mar 31, 2014 6:11:26 PM

Yep, it throws an exception.

Now let’s try it with hostname verification off by setting the ‘loose’ option on the client:

class HowsMySSLSpec extends PlaySpecification with ClientMethods {
  val timeout: Timeout = 20.seconds
  "WS" should {
    "be fooled by dnschef" in {
      val rawConfig = play.api.Configuration(ConfigFactory.parseString(
        """
          |ws.ssl.loose.disableHostnameVerification=true
        """.stripMargin))

      val client = createClient(rawConfig)
      val response = await(client.url("https://www.howsmyssl.com").get())(timeout)
      response.status must be_==(200)
      response.body must contain("Play Framework")
    }
  }
}

Run the test:

[ssltest] $ testOnly HowsMySSLSpec
[info] HowsMySSLSpec
[info]
[info] WS should
[info] + be fooled by dnschef
[info]
[info] Total for specification HowsMySSLSpec
[info] Finished in 9 seconds, 675 ms
[info] 1 example, 0 failure, 0 error
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 12 s, completed Mar 31, 2014 6:08:50 PM

It works! We have fooled WS into setting up a TLS connection with a different host, one that we have control over. If we were evil, we could then proxy https://playframework.com to the intended URL, and save off all the content or inject fake data.

Let’s try Apache HttpClient 3.x:

name := "httpclienttest"

version := "1.0-SNAPSHOT"

libraryDependencies ++= Seq(
    "commons-httpclient" % "commons-httpclient" % "3.1",
    "org.specs2" %% "specs2" % "2.3.10" % "test"
)

scalacOptions in Test ++= Seq("-Yrangepos")

resolvers ++= Seq("snapshots", "releases").map(Resolver.sonatypeRepo)

And the test:

import org.apache.commons.httpclient.HttpClient
import org.apache.commons.httpclient.methods.GetMethod
import org.specs2.mutable.Specification

class HttpClientSpec extends Specification {
  "HTTPClient" should {
    "do something" in {
      val httpclient = new HttpClient()
      val httpget = new GetMethod("https://www.howsmyssl.com/")
      try {
        httpclient.executeMethod(httpget)
        //val line = httpget.getResponseBodyAsString
        //line must not contain("Play Framework")
        httpget.getStatusCode must not be_==(200)
      } finally {
        httpget.releaseConnection()
      }
    }
  }
}

Running this gives:

[info] HttpClientSpec
[info]
[info] HTTPClient should
[info] x do something
[error]    '200' is equal to '200' (HttpClientSpec.scala:14)
[info]
[info]
[info] Total for specification HttpClientSpec
[info] Finished in 18 ms
[info] 1 example, 1 failure, 0 error
[error] Failed: Total 1, Failed 1, Errors 0, Passed 0
[error] Failed tests:
[error]     HttpClientSpec
[error] (test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 4 s, completed Mar 31, 2014 8:26:46 PM

Nope. HttpClient 3.x was retired in 2007, but any code that’s still using it under the hood is vulnerable to this attack.

Try this on your own code and see what it does. I’ll bet it’ll be interesting.

Next

Odds and ends I couldn’t cover elsewhere. And then best practices, and summing things up.

Fixing Hostname Verification


This is the fourth in a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. I recommend The Most Dangerous Code in the World for more background.

Previous posts are:

The Attack: Man in the Middle

The scenario that requires hostname verification is when an attacker is on your local network, and can subvert DNS or ARP, and somehow redirect traffic through his own machine. When you make the call to https://example.com, the attacker can make the response come back to a local IP address, and then send you a TLS handshake with a certificate chain.

The attacker needs you to accept a public key that it owns so that you will continue the conversation with it, so it can’t simply hand you the certificate chain that belongs to example.com — that has a different public key, and the attacker can’t use it. Also, the attacker can’t give you a certificate chain that points to example.com and has the attacker’s public key — the CA should (in theory) refuse to sign the certificate, since the domain belongs to someone else.

However… if the attacker gets a CA to sign a certificate for a site that it does have control of, then the attack works like this:

In the example, DNS is compromised, but an attacker could just as well proxy the request and return the result from a different server. The key to any check of server identity is that the hostname check must happen on the client end, and must be tied to the original request: it must happen out of band, and cannot rely on any response from the server.
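To make that "tied to the original request" point concrete, here is a minimal sketch: the only trustworthy source of the expected identity is the URL the client itself requested, parsed before anything comes back from the network (`expectedHostname` is a name I made up for illustration):

```scala
import java.net.URI

// The expected identity is taken from the client's own request URL,
// out of band from anything the server sends back in the handshake.
def expectedHostname(url: String): String =
  URI.create(url).getHost
```

The verifier then compares this saved-off hostname against the certificate the server presents; nothing in the server's response is allowed to change it.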

The Defense: Hostname Verification

In theory, hostname verification in HTTPS sounds simple enough. You call “https://example.com”, save off the “example.com” bit, and then check it against the X.509 certificate from the server. If the names don’t match, you terminate the connection.

So where do you look in the certificate? According to RFC 6125, hostname verification should be done against the certificate's subjectAlternativeName's dNSName field. Some legacy implementations check against the certificate's commonName field instead, but commonName has been deprecated for quite a while now.

You generate a certificate with the right name by using keytool with the -ext flag to say the certificate has example.com as the DNS record in the subjectAltName field:

keytool -genkeypair \
  -keystore keystore.jks \
  -dname "CN=example.com, OU=Sun Java System Application Server, O=Sun Microsystems, L=Santa Clara, ST=California, C=US" \
  -keypass changeit \
  -storepass changeit \
  -keyalg RSA \
  -keysize 2048 \
  -alias example \
  -ext SAN=DNS:example.com \
  -validity 9999

And to view the certificate:

keytool -list -v -alias example -storepass changeit -keystore keystore.jks

Is it really that simple? Yes. HTTPS is very specific about verifying server identity. You make an HTTPS request, then you check that the certificate that comes back matches the hostname of the request. There’s some bits added on about wildcards, but for the most part it’s not complicated.
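As a rough illustration of the wildcard bits, here is a simplified matcher for a single dNSName entry: an exact match, or a leftmost `*.` wildcard covering exactly one label. This is a sketch of the idea only; real checkers such as Sun's internal HostnameChecker handle more edge cases (IP addresses, internationalized names, and so on):

```scala
// Simplified RFC 2818-style matching of a hostname against one dNSName
// entry. A leftmost "*." wildcard matches exactly one DNS label.
def matchesDnsName(hostname: String, dnsName: String): Boolean = {
  val h = hostname.toLowerCase
  val d = dnsName.toLowerCase
  if (d.startsWith("*.")) {
    val suffix = d.substring(1) // e.g. ".example.com"
    // must end with the suffix, and the wildcarded prefix must be one label
    h.endsWith(suffix) && !h.dropRight(suffix.length).contains(".")
  } else {
    h == d
  }
}
```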

In fact (and this is part of the problem), you can say that HTTPS is defined by the hostname verification requirement layered on top of TLS. The reason HTTPS exists as a distinct RFC, apart from TLS, is the specifics of its hostname verification: LDAP has a distinct secure protocol, LDAPS, which handles hostname verification differently. Every protocol that uses TLS must add its own application-level identity check on top of TLS. TLS, by itself, doesn't define server identity.

Because TLS in its raw form doesn’t do hostname verification, anything that uses raw TLS without doing any server identity check is insecure. This is pretty amazing information in itself, so let’s break this down, and repeat it in bold face and all caps:

A) VERIFICATION OF SERVER IDENTITY IS APPLICATION PROTOCOL SPECIFIC.

B) BECAUSE OF (A), TLS LEAVES IT TO THE APPLICATION TO DO THE HOSTNAME VERIFICATION.

C) YOU CANNOT SECURELY USE RAW TLS WITHOUT ADDING HOSTNAME VERIFICATION.

Given the previous points and the consequences of failure, this would lead us to believe that there must be a safety system in place to validate the TLS configuration before opening a connection. To my knowledge, no such system exists. This is despite the fact that there are really only three main protocols that use TLS: HTTPS, LDAPS, and IMAPS.

TLS LIBRARIES SHOULD MAKE IT IMPOSSIBLE TO USE THEM RAW WITHOUT ANY HOSTNAME VERIFICATION. THEY DO NOT.

As you might guess, this makes lack of hostname verification a very common failure. The Most Dangerous Code in the World specifically calls out the lack of hostname verification as a very common failure of HTTPS client libraries. This is bad, because man in the middle attacks are extremely common.

In 2011, RFC 6125 was published to bridge this gap, but most TLS implementations don't support it. In the absence of a known guide, using RFC 2818 is not unreasonable, and certainly better than nothing.

Implementation in JSSE

The JSSE Reference Guide goes out of its way to mention the need for hostname verification.

“IMPORTANT NOTE: When using raw SSLSockets/SSLEngines you should always check the peer’s credentials before sending any data. The SSLSocket/SSLEngine classes do not automatically verify, for example, that the hostname in a URL matches the hostname in the peer’s credentials. An application could be exploited with URL spoofing if the hostname is not verified.”

JSSE Reference Guide, SSLSession

A little later, the reference guide mentions it again, in context with HttpsURLConnection:

[T]he SSL/TLS protocols do not specify that the credentials received must match those that peer might be expected to send. If the connection were somehow redirected to a rogue peer, but the rogue’s credentials presented were acceptable based on the current trust material, the connection would be considered valid. When using raw SSLSockets/SSLEngines you should always check the peer’s credentials before sending any data. The SSLSocket and SSLEngine classes do not automatically verify that the hostname in a URL matches the hostname in the peer’s credentials. An application could be exploited with URL spoofing if the hostname is not verified.

JSSE Guide, HttpsURLConnection

I’ve never heard the term “URL spoofing” before, and Google shows nothing remotely connected with this term. Ping me if you’ve heard of it.

Anyway. JSSE does do hostname verification, if you set it up just right. For completeness, I’m going to go over all the options.

Hostname Verification in 1.6

In 1.6, if you want to use hostname verification, you have one way to do it. If you use HttpsURLConnection, then JSSE will do hostname verification for you by default. Other than that, you're on your own. JSSE 1.6 does not provide any public classes for you to extend; it's all internal.

If you want to use hostname verification on an SSLEngine, you have to get at an instance of sun.security.ssl.SSLEngineImpl and call sslEngine.trySetHostnameVerification("HTTPS") on it via reflection. This lets ClientHandshaker pass the identifier through to com.sun.net.ssl.internal.ssl.X509ExtendedTrustManager.

Hostname Verification in 1.7

JSSE 1.7 provides you with more options for doing HTTPS hostname verification. In addition to HttpsURLConnection, you have the option of using X509ExtendedTrustManager, because it “enables endpoint verification at the TLS layer.” What this means in practice is that X509ExtendedTrustManager routes through to X509TrustManagerImpl.checkIdentity, as in JDK 1.6.

The reference guide recommends using X509ExtendedTrustManager rather than the legacy X509TrustManager, and even has a worked example. But there’s a catch: X509ExtendedTrustManager is an abstract class, so you must inherit from it. This limits anything fun you might want to do, like aggregating keystore information. As such, it’s only useful if you’re doing minor tweaks.
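Because it's an abstract class, inheriting means implementing all six check methods plus getAcceptedIssuers. A delegating skeleton shows the shape you're locked into (a sketch; `DelegatingTrustManager` is my name, and the delegate here is assumed to be the platform default trust manager):

```scala
import java.net.Socket
import java.security.KeyStore
import java.security.cert.X509Certificate
import javax.net.ssl.{SSLEngine, TrustManagerFactory, X509ExtendedTrustManager}

// A pass-through X509ExtendedTrustManager: every method delegates, which is
// the minimum you must write before adding any behavior of your own.
class DelegatingTrustManager(delegate: X509ExtendedTrustManager) extends X509ExtendedTrustManager {
  def checkClientTrusted(chain: Array[X509Certificate], authType: String): Unit =
    delegate.checkClientTrusted(chain, authType)
  def checkServerTrusted(chain: Array[X509Certificate], authType: String): Unit =
    delegate.checkServerTrusted(chain, authType)
  def checkClientTrusted(chain: Array[X509Certificate], authType: String, socket: Socket): Unit =
    delegate.checkClientTrusted(chain, authType, socket)
  def checkServerTrusted(chain: Array[X509Certificate], authType: String, socket: Socket): Unit =
    delegate.checkServerTrusted(chain, authType, socket)
  def checkClientTrusted(chain: Array[X509Certificate], authType: String, engine: SSLEngine): Unit =
    delegate.checkClientTrusted(chain, authType, engine)
  def checkServerTrusted(chain: Array[X509Certificate], authType: String, engine: SSLEngine): Unit =
    delegate.checkServerTrusted(chain, authType, engine)
  def getAcceptedIssuers: Array[X509Certificate] = delegate.getAcceptedIssuers
}

object DelegatingTrustManager {
  // Pull the JVM's default trust manager (init with null = default cacerts).
  def default(): X509ExtendedTrustManager = {
    val tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm)
    tmf.init(null: KeyStore)
    tmf.getTrustManagers.collectFirst { case x: X509ExtendedTrustManager => x }.get
  }
}
```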

In 1.7, the manual method is not as bad as in 1.6. There's an explicit call you can make (recommended by Stack Overflow):

val sslParams = sslEngine.getSSLParameters()
sslParams.setEndpointIdentificationAlgorithm("HTTPS")
sslEngine.setSSLParameters(sslParams)

But this doesn’t actually work with AsyncHttpClient: you’ll get a NullPointerException!

The reason why: AsyncHttpClient creates an SSLEngine without the peerHost and peerPort. JSSE assumes that if you called sslParams.setEndpointIdentificationAlgorithm("HTTPS") then you also created the SSL engine like this:

sslContext.createSSLEngine(peerHost, peerPort)

So, setEndpointIdentificationAlgorithm is not an option. (The lack of hostname could also possibly have an effect on Server Name Indication, although I haven’t tested that.)

There is another way to do hostname verification, though: you can pass a custom HostnameVerifier to the client configuration.

HostnameVerifier is an interface that normally says “if you’ve tried resolving the hostname yourself and got nothing, then try this.” However, since AsyncHttpClient works directly with SSLEngine, the Netty provider will call the HostnameVerifier on every call to do hostname verification. This gives us the avenue we need.

I ended up using Kevin Locke’s guide to implement a HostnameVerifier that calls to Sun’s internal HostnameChecker, the same way that setEndpointIdentificationAlgorithm("HTTPS") does. The end result is pretty simple:

class DefaultHostnameVerifier extends HostnameVerifier {
  private val logger = LoggerFactory.getLogger(getClass)

  def hostnameChecker: HostnameChecker = HostnameChecker.getInstance(HostnameChecker.TYPE_TLS)

  def matchKerberos(hostname: String, principal: Principal) = HostnameChecker.`match`(hostname, principal.asInstanceOf[KerberosPrincipal])

  def isKerberos(principal: Principal): Boolean = principal != null && principal.isInstanceOf[KerberosPrincipal]

  def verify(hostname: String, session: SSLSession): Boolean = {
    logger.debug(s"verify: hostname = $hostname")

    val checker = hostnameChecker
    val result = try {
      session.getPeerCertificates match {
        case Array(cert: X509Certificate, _*) =>
          try {
            checker.`match`(hostname, cert)
            // Certificate matches hostname
            true
          } catch {
            case e: CertificateException =>
              // Certificate does not match hostname
              logger.debug("verify: Certificate does not match hostname", e)
              false
          }

        case notMatch =>
          // Peer does not have any certificates or they aren't X.509
          logger.debug(s"verify: Peer does not have any certificates: $notMatch")
          false
      }
    } catch {
      case _: SSLPeerUnverifiedException =>
        // Not using certificates for verification, try verifying the principal
        try {
          val principal = session.getPeerPrincipal
          if (isKerberos(principal)) {
            matchKerberos(hostname, principal)
          } else {
            // Can't verify principal, not Kerberos
            logger.debug(s"verify: Can't verify principal, not Kerberos")
            false
          }
        } catch {
          case e: SSLPeerUnverifiedException =>
            // Can't verify principal, no principal
            logger.debug("Can't verify principal, no principal", e)
            false
        }
    }
    logger.debug("verify: returning {}", result)
    result
  }
}

After that, I could set it on the builder and have hostname verification triggered:

val myHostnameVerifier = new DefaultHostnameVerifier()
val config = new AsyncHttpClientConfig.Builder()
  .setHostnameVerifier(myHostnameVerifier)
  .build()

Disabling hostname verification is a loose option:

ws.ssl.loose.disableHostnameVerification = true

And there’s an option to use your own hostname verifier if you’re not on an Oracle JVM:

ws.ssl.hostnameVerifierClassName = "com.mycompany.MyHostnameVerifier"

Of course, there’s always more issues.

The hostname verification needs to happen after an SSL handshake has been completed. If you call session.getPeerCertificates() before the SSL handshake has been established, you'll get an SSLPeerUnverifiedException. You need to set up an SSL handshake listener (using SslHandler for Netty, SSLBaseFilter.HandshakeListener for Grizzly) and only do hostname verification after the session is valid.

AsyncHttpClient 1.8.5 doesn't use a handshake listener. Instead, it uses a completion handler, and hostname verification may fail unexpectedly if you set a custom hostname verifier. It seems to work most of the time in the field due to a race condition: by the time the completion handler is notified, the handshake has already completed. Still working on this.

Oracle’s HostnameChecker does not implement RFC 6125 correctly. Even HttpClient’s StrictHostnameVerifier seems not to be up to spec, and there are many cases where hostname checkers have failed against NULL or bad CN strings.

Nevertheless, it is the way that HTTPS is supposed to work — if you have a problem with hostname verification, you really need to check your X.509 certificate and make sure that your subjectAltName’s dnsName field is set correctly.

Next

Testing Hostname Verification!

Fixing Certificate Revocation


This is the third in a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. (TL;DR — if you’re looking for a better revocation protocol, you may be happier reading Fixing Revocation for Web Browsers on the Internet and PKI: It’s Not Dead, Only Resting.)

Previous posts are:

This post is all about certificate revocation using OCSP and CRL, what it is, how useful it is, and how to configure it in JSSE.

Certificate Revocation (and its Discontents)

The previous post talked about X.509 certificates that had been compromised in some way. Compromised certificates can be a big problem, especially if those certificates have the ability to sign other certificates. If certificates have been broken or forged, then in theory it should be possible for a certificate authority to let a client know as soon as possible which certificates are invalid and should not be used.

The first attempt at certificate revocation was the Certificate Revocation List (CRL), a published list of bad certificates. CRLs were huge and hard to manage.

As an answer to CRLs, the Online Certificate Status Protocol (OCSP) was invented. With OCSP, the client contacts the CA's responder and verifies the certificate's status before it will start talking to the server. According to Cloudflare, this can make TLS up to 33% slower. Part of it may be because OCSP responders are slow, but it's clear that OCSP is not well loved.

In fact, most browsers don’t even bother with OCSP. Adam Langley explains why OCSP is disabled in Chrome:

While the benefits of online revocation checking are hard to find, the costs are clear: online revocation checks are slow and compromise privacy. The median time for a successful OCSP check is ~300ms and the mean is nearly a second. This delays page loading and discourages sites from using HTTPS. They are also a privacy concern because the CA learns the IP address of users and which sites they’re visiting.

On this basis, we’re currently planning on disabling online revocation checks in a future version of Chrome. (There is a class of higher-security certificate, called an EV certificate, where we haven’t made a decision about what to do yet.)

— “Revocation checking and Chrome’s CRL”

Adding insult to injury, OCSP also has security issues:

Alas, there was a problem — and not just “the only value people are adding is republishing the data from the CA”. No, this concept doesn’t work at all, because OCSP assumes a CA never loses control of its keys. I repeat, the system in place to handle a CA losing its keys, assumes the CA never loses the keys.

Dan Kaminsky

and:

OCSP is actually much, much worse than you describe. The status values are, as you point out, broken. Even if you fix that (as some CAs have proposed, after being surprised to find out how OCSP really worked – yes, some of the folks charged with running OCSP don’t actually know how it really works) it doesn’t help, given OCSP’s broken IDs an attacker can trivially work around this. And if you fix those, given the replay-attack-enabled “high-performance” optimisation an attacker can work around that. And if you fix that, given that half the response is unauthenticated, an attacker can go for that. To paraphrase Lucky Green, OCSP is multiple-redundant broken, by design. If you remove the bits that don’t work (the response status, the cert ID, nonces, and the unauthenticated portions of the response) there is literally nothing left. There’s an empty ASN.1 shell with no actual content. There is not one single bit of OCSP that actually works as it’s supposed to (or at least “as a reasonable person would expect it to”, since technically it does work exactly as the spec says it should).

Peter Gutmann, replying to Dan Kaminsky

And to drive the point home, if you have someone sitting on your network with a copy of sslsniff, they can trivially fake out a response:

As an attacker, it is thus possible for us to intercept any OCSP request and send a tryLater response without having to generate a signature of any kind. The composition of the response is literally just the OCSPResponseStatus for the tryLater condition, which is simply the single-byte ASCII character ‘3’.

Most OCSP implementations will accept this response without generating any kind of warning to the user that something might be amiss, thus defeating the protocol.

Defeating OCSP With The Character ‘3’
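For concreteness, here is what that forged response amounts to. Assuming my DER encoding of the OCSPResponse structure is right (the quote itself only pins down the status value, 3 = tryLater), the entire unsigned response is five bytes:

```scala
// OCSPResponse ::= SEQUENCE { responseStatus OCSPResponseStatus, ... }
// With responseBytes absent, a "tryLater" response DER-encodes as:
val tryLater: Array[Byte] = Array(
  0x30, 0x03,      // SEQUENCE, length 3
  0x0a, 0x01, 0x03 // ENUMERATED, length 1, value 3 = tryLater(3)
).map(_.toByte)
```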

Given all of this, it’s hard to say OCSP is worthwhile. However, it’s important to note that all of the above comments are talking about public revocation checking against browsers and mobile devices in the wild.

If you’re using web services in an internal network, OCSP actually sounds useful. Privacy is less of an issue, you’re running on an internal network, you can make your OCSP responder fast enough, and using a hard-fail approach for a web service is reasonable. The research on the use of OCSP in web services is thin: I found one article. Presumably, OCSP gets rolled into PKI enterprise management solutions.

I also haven’t heard of any exploits in the wild, perhaps because OCSP is so rarely used. This is not to say that OCSP is secure… but even speed bumps can be effective sometimes.

Certificate Revocation in JSSE

Certificate Revocation in JSSE is disabled by default, because of the performance issues. I decided to leave this disabled out of the box, but did what I could to make it easier to configure.

The implementation is… convoluted. The details are spelled out in Appendix C of the PKI Guide and Enable OCSP checking, but it’s still incomplete.

You need to set up the system properties, on the command line:

java -Dcom.sun.security.enableCRLDP=true -Dcom.sun.net.ssl.checkRevocation=true

You need to do this because the system properties set up private static final fields internally — so if you’re calling this in a running JVM, you need to ensure that nothing’s loaded up the JSSE classes already, or you’ll have to resort to fiddling the already loaded classes, a solution that isn’t appropriate in production code.
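In practice, if you can't pass command-line flags, that means flipping the properties at the very top of your main entry point, before anything that might touch JSSE (a sketch; `RevocationBootstrap` is just an illustrative name):

```scala
object RevocationBootstrap {
  // These must run before any JSSE class loads: the values are copied into
  // private static final fields at class-initialization time, so setting
  // them later in the JVM's life has no effect.
  def enableRevocationChecking(): Unit = {
    System.setProperty("com.sun.security.enableCRLDP", "true")
    System.setProperty("com.sun.net.ssl.checkRevocation", "true")
  }
}
```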

To set up OCSP, you need to set the security property, by adding the following to your initialization code:

java.security.Security.setProperty("ocsp.enable", "true")

It is a small mercy that “ocsp.enable” is checked at runtime from PKIXCertPathValidator, so you can do that any time you feel like.

Currently, the configuration looks like this:

ws.ssl.checkRevocation = true
ws.ssl.revocationLists = [ "http://example.com/crl" ]

When checkRevocation is true, it will set “ocsp.enable” to true, set up the static revocation lists and do the work of passing in settings to the trust manager.

Generating an individual CRL is done using a DataInputStream:

  def generateCRL(inputStream: InputStream): CRL = {
    val cf = CertificateFactory.getInstance("X509")
    cf.generateCRL(inputStream)
  }

  def generateCRLFromURL(url: URL): CRL = {
    val connection = url.openConnection()
    connection.setDoInput(true)
    connection.setUseCaches(false)
    val inStream = new DataInputStream(connection.getInputStream)
    try {
      generateCRL(inStream)
    } finally {
      inStream.close()
    }
  }

When you set up the trust manager, you have to set up an instance of PKIXBuilderParameters (see Fixing X.509 Certificates for where this fits in). The CRLs are then added as a CertStore, built with the CollectionCertStoreParameters class.

  def buildTrustManagerParameters(trustStore: KeyStore,
    revocationEnabled: Boolean,
    revocationListOption: Option[Seq[CRL]],
    signatureConstraints: Set[AlgorithmConstraint],
    keyConstraints: Set[AlgorithmConstraint]): CertPathTrustManagerParameters = {
    import scala.collection.JavaConverters._

    val certSelect: X509CertSelector = new X509CertSelector
    val pkixParameters = new PKIXBuilderParameters(trustStore, certSelect)
    pkixParameters.setRevocationEnabled(revocationEnabled)

    // Set the static revocation list if it exists...
    revocationListOption.foreach { crlList =>
      pkixParameters.addCertStore(CertStore.getInstance("Collection",
        new CollectionCertStoreParameters(crlList.asJavaCollection)))
    }

    // Add the algorithm checker in here...
    val checkers: Seq[PKIXCertPathChecker] = Seq(
      new AlgorithmChecker(signatureConstraints, keyConstraints)
    )

    // Use the custom cert path checkers we defined...
    pkixParameters.setCertPathCheckers(checkers.asJava)
    new CertPathTrustManagerParameters(pkixParameters)
  }

And we’re done.

Testing

Once you have everything configured, you can turn on debugging to check that OCSP is enabled:

java -Djava.security.debug="certpath ocsp"

And optionally use Wireshark to sniff OCSP responses.

There are a number of OCSP responders that are simply broken once you turn checking on, but How’s My SSL works well.

Finally, a note about testing. Because of the system properties problem, using configuration from inside a running JVM is difficult, which can snarl tests:

class HowsMySSLSpec extends PlaySpecification with CommonMethods {
  val timeout: Timeout = 20.seconds
  "WS" should {
    "connect to a remote server" in {
      val rawConfig = play.api.Configuration(ConfigFactory.parseString(
        """
          |ws.ssl.debug=["certpath", "ocsp"]
          |ws.ssl.checkRevocation=true  # doesn't set system properties before classes load!
        """.stripMargin))

      val client = createClient(rawConfig)
      val response = await(client.url("https://www.howsmyssl.com/a/check").get())(timeout)
      response.status must be_==(200)
    }
  }
}

Instead, you need to specify the system properties to SBT or set the system properties before the test runs:

javaOptions in Test ++= Seq("-Dcom.sun.security.enableCRLDP=true", "-Dcom.sun.net.ssl.checkRevocation=true")

You should see as output (under JDK 1.8):

certpath: -Using checker7 ... [sun.security.provider.certpath.RevocationChecker]
certpath: connecting to OCSP service at: http://gtssl2-ocsp.geotrust.com
certpath: OCSP response status: SUCCESSFUL
certpath: OCSP response type: basic
certpath: Responder's name: CN=GeoTrust SSL CA - G2 OCSP Responder, O=GeoTrust Inc., C=US
certpath: OCSP response produced at: Wed Mar 19 13:57:32 PDT 2014
certpath: OCSP number of SingleResponses: 1
certpath: OCSP response cert #1: CN=GeoTrust SSL CA - G2 OCSP Responder, O=GeoTrust Inc., C=US
certpath: Status of certificate (with serial number 159761413677206476752317239691621661939) is: GOOD
certpath: Responder's certificate includes the extension id-pkix-ocsp-nocheck.
certpath: OCSP response is signed by an Authorized Responder
certpath: Verified signature of OCSP Response
certpath: Response's validity interval is from Wed Mar 19 13:57:32 PDT 2014 until Wed Mar 26 13:57:32 PDT 2014
certpath: -checker7 validation succeeded

And that takes care of testing.

Next

Hostname Verification!

Fixing X.509 Certificates

| Comments

This is a continuation in a series of posts about how to correctly configure a TLS client using JSSE, using The Most Dangerous Code in the World as a guide. This post is about X.509 certificates in TLS, and has some videos to show both what the vulnerabilities are, and how to fix them. I highly recommend the videos, as they do an excellent job of describing problems that TLS faces in general.

Also, JDK 1.8 just came out and has much better encryption. Now would be a good time to upgrade.

Table of Contents

Part One: we talk about how to correctly use and verify X.509 certificates.

  • What X.509 Certificates Do
  • Understanding Chain of Trust
  • Understanding Certificate Signature Forgery
  • Understanding Signature Public Key Cracking

Part Two: We discuss how to check X.509 certificates.

  • Validating a X.509 Certificate in JSSE
  • Validating Key Size and Signature Algorithm

What X.509 Certificates Do

The previous post talked about using secure ciphers and algorithms. This alone is enough to set up a secure connection, but there’s no guarantee that you are talking to the server that you think you are talking to.

Without some means to verify the identity of a remote server, an attacker could still present itself as the remote server and then forward the secure connection onto the remote server. This is the problem that Netscape had.

As it turned out, another organization had already come up with a solution. The ITU-T had some directory services that needed authentication, and had set up a system of public key certificates in a format called X.509, in a binary encoding known as ASN.1 DER. That entire system was copied wholesale for use in SSL, and X.509 certificates became the way to verify the identity of a server.

The best way to think about public key certificates is as a passport system. Certificates are used to establish information about the bearer of that information in a way that is difficult to forge. This is why certificate verification is so important: accepting any certificate means that an attacker’s certificate will be blindly accepted.

X.509 certificates contain a public key (typically RSA based), and a digest algorithm (typically in the SHA-2 family, e.g. SHA-512) which provides a cryptographic hash. Together these are known as the signature algorithm (i.e. “RSAWithSHA512”). One certificate can sign another by taking all the DER encoded bits of a new certificate (basically everything except “SignatureAlgorithm”) and passing it through the digest algorithm to create a cryptographic hash. That hash is then signed by the private key of the organization owning the issuing certificate, and the result is stuck onto the end of the new certificate in a new “SignatureValue” field. Because the issuer’s public key is available, and the hash could only have been generated from the certificate that was given as input, we can treat it as “signed” by the issuer.
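The hash-and-sign step can be sketched with the JCA Signature API. This is not the actual certificate DER encoding, just the mechanics, with a throwaway key pair standing in for the issuer's and hypothetical to-be-signed bytes:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
    public static void main(String[] args) throws Exception {
        // Throwaway RSA key pair standing in for the issuer's key.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair issuer = kpg.generateKeyPair();

        // In a real certificate, this would be the DER encoding of
        // everything except the signature fields.
        byte[] tbs = "hypothetical TBSCertificate bytes".getBytes(StandardCharsets.UTF_8);

        // "SHA256withRSA": hash with SHA-256, sign the hash with the private key.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(issuer.getPrivate());
        signer.update(tbs);
        byte[] signatureValue = signer.sign();

        // Anyone holding the issuer's public key can verify the signature.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(issuer.getPublic());
        verifier.update(tbs);
        System.out.println(verifier.verify(signatureValue)); // true
    }
}
```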

So far, so good. Unfortunately, X.509 certificates are complex. Very few people understand (or agree on) the various fields that can be involved in X.509 certificates, and even fewer understand ASN.1 DER, the binary format that X.509 is encoded in (which has led to some interesting attacks on the format). So much of the original X.509 specification was vague that PKIX was created to nail down some of the extensions. Currently, only a handful of the PKIX extensions seem to be important in practice.

There are other fields in X.509, but in practice, X.509 compatibility is so broken that few of them matter. For example, nameConstraints is considered near useless and policyConstraints has been misunderstood and exploited.

So if you want to do the minimum amount of work, all you need is some approximation to a DN, maybe a basicConstraints, and if you’re feeling really enthusiastic, keyUsage (although this is often ignored by implementations, see the part 2a slides for examples. Even basicConstraints, the single most fundamental extension in a certificate, and in most cases just a single boolean value, was widely ignored until not too long ago).

Peter Gutmann

Peter Gutmann is an excellent resource on X.509 certificates (although he does have a tendency to rant). Read the X.509 Style Guide, check out the X.509 bits of Godzilla Crypto Tutorial, and buy Engineering Security when it comes out of draft — it has over 500 pages of exhaustively detailed security fails.

If you’re not up for that, the best overall reference is Zytrax’s SSL Survival Guide, and the presentation of “Black Ops of PKI” by Dan Kaminsky is a good introduction:

Understanding Chain of Trust

In TLS, the server not only sends its own certificate (known as an “end entity certificate” or EE), but also a chain of certificates that lead up to (but not including) a root CA certificate issued by a certificate authority (CA for short). Each of these certificates is signed by the one above them so that they are known to be authentic. Certificate validation in TLS goes through a specific algorithm to validate each individual certificate, then match signatures with each one in the chain to establish a chain of trust.

Bad things can happen if the chain of trust only checks the signature and does not also check the keyUsage and the basicConstraints fields in X.509. Moxie Marlinspike has an excellent presentation at DEFCON 17 on defeating TLS, starting off with subverting the chain of trust:

Understanding Certificate Signature Forgery

Certificates should be signed with an algorithm from the SHA-2 family (i.e. at least SHA-256). This is important, because it prevents signature forgery.

The chain of trust works because a certificate can say “this certificate is good because it has been signed by someone I trust.” If you can forge a signature, then you can pass yourself off as a certificate authority. In MD5 Considered Harmful Today, a team showed that they were able to forge an MD5-signed certificate in exactly this manner:

Since the original paper, an MD5 based attack like this has been seen in the wild. A virus called Flame forged a signature (jumping through a series of extremely difficult technical hurdles), and used it to hijack the Windows Update mechanism used by Microsoft to patch machines, completely compromising almost 200 servers.

MD2 was broken in this paper, and is no longer considered a secure hash algorithm. MD4 is considered historic. As shown in the paper and video, MD5 is out, and the current advice is to avoid using the MD5 algorithm in any capacity. Mozilla is even more explicit about not using MD5 as a hash algorithm for intermediate and end entity certificates.

SHA1 has not been completely broken yet, but it is starting to look very weak. The current advice is to stop using SHA-1 as soon as practical and it has been deprecated by Microsoft. Using SHA-1 is still allowed by NIST on existing certificates though.

Federal agencies may use SHA-1 for the following applications: verifying old digital signatures and time stamps, generating and verifying hash-based message authentication codes (HMACs), key derivation functions (KDFs), and random bit/number generation. Further guidance on the use of SHA-1 is provided in SP 800-131A.

NIST’s Policy on hash functions, September 28, 2012

Even the JSSE documentation itself says that SHA-2 is required, although it leaves this as an exercise for the reader:

“The strict profile suggest all certificates should be signed with SHA-2 or stronger hash functions. In JSSE, the processes to choose a certificate for the remote peer and validate the certificate received from remote peer are controlled by KeyManager/X509KeyManager and TrustManager/X509TrustManager. By default, the SunJSSE provider does not set any limit on the certificate’s hash functions. Considering the above strict profile, the coder should customize the KeyManager and TrustManager, and limit that only those certificate signed with SHA-2 or stronger hash functions are available or trusted.”

TLS and NIST’S Policy on Hash Functions

So, the SHA-2 family. And indeed, most public certificates (over 95%) are signed this way.

Understanding Signature Public Key Cracking

An X.509 certificate has an embedded public key, almost universally RSA. RSA has a modulus component (also known as key size or key length), which is intended to be difficult to factor. Some of these public keys were created at a time when computers were smaller and weaker than they are now. Simply put, their key size is now far too small. Those public keys may still be valid, but the security they provide isn’t adequate against today’s technology.
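You can inspect the modulus size of any RSA public key directly. A quick sketch with a freshly generated key:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.interfaces.RSAPublicKey;

public class KeySizeDemo {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        // The bit length of the modulus is the "key size" people talk about.
        RSAPublicKey pub = (RSAPublicKey) kp.getPublic();
        System.out.println(pub.getModulus().bitLength()); // 2048
    }
}
```

The same check against a certificate's key (via `cert.getPublicKey()`) is what a key-size constraint ultimately boils down to.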

The Mozilla Wiki brings the point home in three paragraphs:

The other concern that needs to be addressed is that of RSA1024 being too small a modulus to be robust against faster computers. Unlike a signature algorithm, where only intermediate and end-entity certificates are impacted, fast math means we have to disable or remove all instances of 1024-bit moduli, including the root certificates.

The NIST recommendation is to discontinue 1024-bit RSA certificates by December 31, 2010. Therefore, CAs have been advised that they should not sign any more certificates under their 1024-bit roots by the end of this year.

The date for disabling/removing 1024-bit root certificates will be dependent on the state of the art in public key cryptography, but under no circumstances should any party expect continued support for this modulus size past December 31, 2013. As mentioned above, this date could get moved up substantially if new attacks are discovered. We recommend all parties involved in secure transactions on the web move away from 1024-bit moduli as soon as possible.

Dates for Phasing out MD5-based signatures and 1024-bit moduli

This needs the all caps treatment:

KEY SIZE MUST BE CHECKED ON EVERY SIGNATURE IN THE CERTIFICATE, INCLUDING THE ROOT CERTIFICATE.

and:

UNDER NO CIRCUMSTANCES SHOULD ANY PARTY EXPECT SUPPORT FOR 1024 BIT RSA KEYS IN 2014.

1024 bit certificates are dead, dead, dead. They cannot be considered secure. NIST has recommended at least 2048 bits in 2013, there’s a website entirely devoted to appropriate key lengths, and the topic is covered extensively in key management solutions. The certificate authorities stopped issuing them a while ago, and over 95% of trusted leaf certificates and 95% of trusted signing certificates use NIST recommended key sizes.

The same caveats apply to DSA and ECC key sizes: keylength.com has the details.

Miscellaneous

OWASP lists some guidelines on creating certificates, notably “Do not use wildcard certificates” and “Do not use RFC 1918 addresses in certificates”. While these are undoubtedly questionable practices, I don’t think it’s appropriate to have rules forbidding them.

Part Two: Implementation

The relevant documentation is the Certificate Path Programmer Guide, also known as Java PKI API Programmer’s Guide:

Despite listing problems in verification above, I’m going to assume that JSSE checks certificates and certificate chains correctly, and doesn’t have horrible bugs in the implementation. I am concerned that JSSE may have vulnerabilities, but part of the problem is knowing exactly what the correct behavior should be, and TLS does not come with a reference implementation or a reference suite. As far as I know, JSSE has not been subject to NIST PKI testing or the X.509 test suite from CPNI, and CPNI doesn’t release their test suite to the public. I am also unaware of any publicly available X.509 certificate fuzzing tools.

There is a certificate testing tool called tlspretense, which (once it is correctly configured) will run a suite of incorrect certificates and produce a nice report.

What I can do is make sure that weak algorithms and key sizes are disabled, even in 1.6.

Validating a Certificate in JSSE

Validating a single certificate is easy. Certificate validation is handled by the java.security.cert package, and basic certificate validation (including expiration checking) is done using X509Certificate:

certificate.checkValidity()

An interesting side note — although a trust store contains certificates, the fact that they are X.509 certificates is a detail. Anchors are just subject distinguished name and public key bindings. This means they don’t have to be signed, and don’t really have an expiration date. This tripped me (and a few others) up, but RFC 3280 and RFC 5280 are quite clear that expiration doesn’t apply to trust anchors or trust stores.
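A quick sketch showing that a trust anchor really is just a name/public-key binding — no certificate, no signature, no expiration date (the CA name here is hypothetical):

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.cert.TrustAnchor;
import javax.security.auth.x500.X500Principal;

public class TrustAnchorDemo {
    public static void main(String[] args) throws Exception {
        // A key pair standing in for a CA's key.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair ca = kpg.generateKeyPair();

        // TrustAnchor can be built from just a distinguished name and a
        // public key; the third argument is optional name constraints.
        TrustAnchor anchor = new TrustAnchor(
            new X500Principal("CN=Hypothetical Root, O=Example"), ca.getPublic(), null);

        // There is no X.509 certificate backing this anchor at all.
        System.out.println(anchor.getTrustedCert() == null); // true
    }
}
```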

Validating Key Size and Signature Algorithm

We need to make sure that JSSE is not accepting weak certificates. In particular, we want to check that the X.509 certificates have a decent signature algorithm and a decent key size.

Now, there is a jdk.certpath.disabledAlgorithms security setting in JDK 1.7 that looks very close to doing what we want. Setting jdk.certpath.disabledAlgorithms is covered in the previous post.

But jdk.certpath.disabledAlgorithms is only in 1.7 and is global across the JVM. We need to support JDK 1.6 and make it local to the SSLContext. We can do better.
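The global nature of the setting follows from it being a java.security Security property rather than anything tied to an SSLContext. A minimal sketch (the constraint string is just an example):

```java
import java.security.Security;

public class DisabledAlgsDemo {
    public static void main(String[] args) {
        // Security properties are JVM-global: setting this affects every
        // SSLContext in the process, which is exactly the limitation noted above.
        Security.setProperty("jdk.certpath.disabledAlgorithms", "MD2, MD5, RSA keySize < 1024");
        System.out.println(Security.getProperty("jdk.certpath.disabledAlgorithms"));
    }
}
```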

Here’s what an example configuration looks like:

ws.ssl {
  disabledSignatureAlgorithms = "MD2, MD4, MD5, SHA1"
  disabledKeyAlgorithms = "RSA keySize <= 1024, DSA keySize <= 1024, EC keySize < 224"
}

I’ll skip over the details of how parsing and algorithm decomposition is done, except to say Scala contains a parser combinator library which makes writing small parsers very easy. On configuration, each of the statements parses out into an AlgorithmConstraint that checks whether the certificate’s key size or algorithm matches.
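I won't reproduce the parser here, but a simplified (and hypothetical) sketch of decomposing a JCA signature algorithm name into its component algorithms might look like this:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DecomposeDemo {
    // Simplified sketch: split a JCA name like "SHA256withRSAandMGF1"
    // on the "with"/"and" connectors, case-insensitively. The real
    // Algorithms.decomposes handles more forms than this.
    static Set<String> decompose(String sigAlgName) {
        Set<String> parts = new LinkedHashSet<>();
        for (String p : sigAlgName.split("(?i)with|(?i)and")) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(decompose("SHA256withRSA")); // [SHA256, RSA]
    }
}
```

Each resulting component ("SHA256", "RSA") can then be matched against the configured constraints individually.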

There’s an AlgorithmChecker that checks for signature and key algorithms (note that the signature algorithm skips the root):

class AlgorithmChecker(val signatureConstraints: Set[AlgorithmConstraint], val keyConstraints: Set[AlgorithmConstraint]) extends PKIXCertPathChecker {
  ...
  def check(cert: Certificate, unresolvedCritExts: java.util.Collection[String]) {
    cert match {
      case x509Cert: X509Certificate =>

        val commonName = getCommonName(x509Cert)
        val subAltNames = x509Cert.getSubjectAlternativeNames
        logger.debug(s"check: checking certificate commonName = $commonName, subjAltName = $subAltNames")

        if (isRootCert) {
          logger.debug(s"checkSignatureAlgorithms: skipping signature checks on trusted root certificate $commonName")
          isRootCert = false
        } else {
          checkSignatureAlgorithms(x509Cert)
        }

        checkKeyAlgorithms(x509Cert)
      case _ =>
        throw new UnsupportedOperationException("check only works with x509 certificates!")
    }
  }
  ...
}

and finally:

class AlgorithmChecker(val signatureConstraints: Set[AlgorithmConstraint], val keyConstraints: Set[AlgorithmConstraint]) extends PKIXCertPathChecker {
  ...
  def checkSignatureAlgorithms(x509Cert: X509Certificate): Unit = {
    val sigAlgName = x509Cert.getSigAlgName
    val sigAlgorithms = Algorithms.decomposes(sigAlgName)

    logger.debug(s"checkSignatureAlgorithms: sigAlgName = $sigAlgName, sigAlgorithms = $sigAlgorithms")

    for (a <- sigAlgorithms) {
      findSignatureConstraint(a).map {
        constraint =>
          if (constraint.matches(a)) {
            logger.debug(s"checkSignatureAlgorithms: x509Cert = $x509Cert failed on constraint $constraint")
            val msg = s"Certificate failed: $a matched constraint $constraint"
            throw new CertPathValidatorException(msg)
          }
      }
    }
  }
  ...
}

Now that we have an algorithm checker, we need to put it into the chain.

Note that there are two ways of validating a chain in JSSE. The first is using CertPathValidator, which validates a certificate chain according to RFC 3280. The second is CertPathBuilder, which “builds” a certificate chain according to RFC 4158. I’ve been told by informed experts that CertPathBuilder is actually closer to the behavior of modern browsers, but in this case, we’re just adding onto the chain of PKIXCertPathChecker. There are several layers of configuration to go through, but eventually we pass this through to the TrustManager.

To do this, we have to create a PKIXBuilderParameters object and then attach the AlgorithmChecker to it, then stick that inside ANOTHER parameters object called CertPathTrustManagerParameters and then pass that into the factory.init method:

class ConfigSSLContextBuilder {
  def buildTrustManagerParameters(trustStore: KeyStore,
    signatureConstraints: Set[AlgorithmConstraint],
    keyConstraints: Set[AlgorithmConstraint]): CertPathTrustManagerParameters = {
    import scala.collection.JavaConverters._

    val certSelect: X509CertSelector = new X509CertSelector
    val pkixParameters = new PKIXBuilderParameters(trustStore, certSelect)

    // Add the algorithm checker in here...
    val checkers: Seq[PKIXCertPathChecker] = Seq(
      new AlgorithmChecker(signatureConstraints, keyConstraints)
    )

    // Use the custom cert path checkers we defined...
    pkixParameters.setCertPathCheckers(checkers.asJava)
    new CertPathTrustManagerParameters(pkixParameters)
  }

  def buildTrustManager(tsc: TrustStoreConfig,
    signatureConstraints: Set[AlgorithmConstraint],
    keyConstraints: Set[AlgorithmConstraint]): X509TrustManager = {

    val factory = trustManagerFactory
    val trustStore = trustStoreBuilder(tsc).build()
    val trustManagerParameters = buildTrustManagerParameters(
      trustStore,
      signatureConstraints,
      keyConstraints
    )

    factory.init(trustManagerParameters)
    val trustManagers = factory.getTrustManagers
    if (trustManagers == null) {
      val msg = s"Cannot create trust manager with configuration $tsc"
      throw new IllegalStateException(msg)
    }

    // The JSSE implementation only sends back ONE trust manager, X509TrustManager
    trustManagers.head.asInstanceOf[X509TrustManager]
  }
}


And we’re done. Now we can check for bad X.509 algorithms out of the box, and have it be local to the SSLContext.

Testing

The best way to create X.509 certificates with Java is using keytool. Unfortunately, keytool doesn’t support subjectAltName in 1.6, but in 1.7 and 1.8 you can specify the subjectAltName (which is required for hostname verification) using the -ext parameter.

For example, to create your own self-signed certificate (both private key and public key) for use in testing, you would specify:

keytool -genkeypair \
-keystore keystore.jks \
-dname "CN=localhost, OU=Example Org, O=Example Company, L=San Francisco, ST=California, C=US" \
-keypass changeit \
-storepass changeit \
-keyalg RSA \
-keysize 2048 \
-alias localhost \
-ext SAN=DNS:localhost,IP:127.0.0.1 \
-validity 365

NOTE: I don’t like using JKS outside of a testing context, because they are Java specific and can contain private keys. Likewise, don’t specify “localhost” or an IP address outside of a testing context.

You can verify your certificate with Portecle, a GUI tool for certificates (you have to click on the extensions link to see the subjectAltName), or you can list your certificate directly:

keytool -list -v -alias localhost -storepass changeit -keystore keystore.jks

You should see:

Alias name: localhost
Creation date: Mar 26, 2014
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=localhost, OU=Example Org, O=Example Company, L=San Francisco, ST=California, C=US
Issuer: CN=localhost, OU=Example Org, O=Example Company, L=San Francisco, ST=California, C=US
Serial number: 4180f5e0
Valid from: Wed Mar 26 10:22:59 PDT 2014 until: Thu Mar 26 10:22:59 PDT 2015
Certificate fingerprints:
   MD5:  F3:32:40:C9:00:59:D3:32:E1:75:85:7A:A9:68:6D:F5
   SHA1: 37:9D:90:44:AB:41:AD:8D:F5:E4:6C:03:5F:22:61:53:EF:23:67:1E
   SHA256: 88:FF:83:43:E1:2D:F1:19:7B:3E:1D:4D:88:40:C3:8C:8A:96:2D:75:16:4F:C8:E9:0B:99:F5:0E:53:4A:C1:17
   Signature algorithm name: SHA256withRSA
   Version: 3

Extensions:

#1: ObjectId: 2.5.29.17 Criticality=false
SubjectAlternativeName [
  DNSName: localhost
  IPAddress: 127.0.0.1
]

#2: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 62 30 8E 8C F2 7C 7A BC   FD EB AC 75 F6 BD FD F1  b0....z....u....
0010: 3E 73 D5 A9                                        >s..
]
]

If you have a signature of SHA256withRSA and DNSName: localhost, then it worked.
And then calls to “https://localhost” or “https://127.0.0.1” would work fine. Then you can pass in your local keystore using the options defined in the customization section:

java -Djavax.net.ssl.trustStore=keystore.jks -Djavax.net.ssl.keyStore=keystore.jks -Djavax.net.ssl.keyStorePassword=changeit -Djavax.net.ssl.trustStorePassword=changeit

Or you can wire the certificates into an SSLContext directly using a TrustManagerFactory and a KeyManagerFactory, and then set up a server and a client from the SSLContext as shown here:

private SSLContext sslc;
private SSLEngine ssle1;    // client
private SSLEngine ssle2;    // server

private void createSSLEngines() throws Exception {
    ssle1 = sslc.createSSLEngine("client", 1);
    ssle1.setUseClientMode(true);

    ssle2 = sslc.createSSLEngine("server", 2);
    ssle2.setUseClientMode(false);
}

X.509 certificates are one of the moving pieces of TLS that have many, many ways of going wrong. Be prepared to find out-of-order certificates, missing intermediate certificates, and other problematic practices.

Certificate path debugging can be turned on using the -Djava.security.debug=certpath and -Djavax.net.debug="ssl trustmanager" settings. How to analyze Java SSL errors is a good example of tracking down bugs.

Next

Certificate Revocation!

Monkeypatching Java Classes

| Comments

So, remember that thing I said last post about adding some debug features to JSSE? Turns out that didn’t work.

Sometimes debugging would work. Sometimes it wouldn’t. I couldn’t get it to work reliably.

The debugging feature in JSSE is enabled by setting -Djavax.net.debug=ssl on the command line. The class that reads from the system property is sun.security.ssl.Debug, and reading it explained everything:

public class Debug {

    private String prefix;
    private static String args;

    static {
        args = java.security.AccessController.doPrivileged(
            new GetPropertyAction("javax.net.debug", ""));
        args = args.toLowerCase(Locale.ENGLISH);
        if (args.equals("help")) {
            Help();
        }
    }

    public static Debug getInstance(String option)
    {
        return getInstance(option, option);
    }

    public static Debug getInstance(String option, String prefix)
    {
        if (isOn(option)) {
            Debug d = new Debug();
            d.prefix = prefix;
            return d;
        } else {
            return null;
        }
    }

    public static boolean isOn(String option)
    {
        if (args == null) {
            return false;
        } else {
            int n = 0;
            option = option.toLowerCase(Locale.ENGLISH);

            if (args.indexOf("all") != -1) {
                return true;
            } else if ((n = args.indexOf("ssl")) != -1) {
                if (args.indexOf("sslctx", n) == -1) {
                    // don't enable data and plaintext options by default
                    if (!(option.equals("data")
                        || option.equals("packet")
                        || option.equals("plaintext"))) {
                        return true;
                    }
                }
            }
            return (args.indexOf(option) != -1);
        }
    }
}

This explained why I was seeing problems with my debug code. The args field of Debug was being set in a static initialization block, when the class was first loaded into the JVM. Due to some race conditions, my code could call System.setProperty("javax.net.debug", options) before Debug was loaded, but there was no way of ensuring that. And of course, the args field was private static, so there was no way to modify it from outside the class after that point.
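The underlying class-loading behavior is easy to reproduce. A minimal sketch (using a hypothetical demo.debug property) showing that a static initializer reads the system property exactly once, at class load, so later changes have no effect:

```java
public class StaticInitDemo {
    static class Debugish {
        // Cached once, when the class is first initialized, just like
        // sun.security.ssl.Debug caches javax.net.debug in its static block.
        static final String args = System.getProperty("demo.debug", "");
    }

    public static void main(String[] argv) {
        System.setProperty("demo.debug", "ssl"); // set BEFORE first touching Debugish
        System.out.println(Debugish.args);       // class init runs here, reads "ssl"

        System.setProperty("demo.debug", "all"); // too late: static init already ran
        System.out.println(Debugish.args);       // still "ssl"
    }
}
```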

But it was even worse than that. The static methods isOn and getInstance were used by the JSSE internal classes to determine whether debug information should be logged or not, and the resulting Debug fields in those classes were also marked private static final, e.g. in sun.security.ssl.SSLContextImpl:

public abstract class SSLContextImpl extends SSLContextSpi {

    private static final Debug debug = Debug.getInstance("ssl");

    private X509ExtendedKeyManager chooseKeyManager(KeyManager[] kms)
            throws KeyManagementException {
        for (int i = 0; kms != null && i < kms.length; i++) {
            if (debug != null && Debug.isOn("sslctx")) {
                System.out.println(
                    "X509KeyManager passed to " +
                    "SSLContext.init():  need an " +
                    "X509ExtendedKeyManager for SSLEngine use");
            }
            return new AbstractKeyManagerWrapper((X509KeyManager)kms[i]);
        }
        return DummyX509KeyManager.INSTANCE;
    }
}

In order to turn debugging on, I not only needed to change the args field in sun.security.ssl.Debug after it had been initialized, but I also needed to change every class that used private static final Debug debug = Debug.getInstance("ssl"); so that it would no longer be null.

The first idea I came up with was to change Debug.getInstance() and Debug.isOn. Now, there are ways to change method definitions in the JVM. The package java.lang.instrument defines java agents that can swap out code at run time (“hot reloading”), there’s “HotSwap” the JVM debug option, and there’s the disposable classloader option. None of them are really applicable in this case — JSSE is a system level package, Play does not require a java agent, and HotSwap is… well, it would probably work fine. But we don’t actually need to change the method definition. We just need to swap out private static final field references at run time.

The language tells you that you can’t touch final fields and it tells you that you absolutely can’t touch private fields from outside the class. It’s lying. If you own the JVM, you can monkey patch any field reference.

The key is a method called sun.misc.Unsafe#putObject. It looks like this:

trait MonkeyPatcher {

  private val unsafe: sun.misc.Unsafe = {
    val field = Class.forName("sun.misc.Unsafe").getDeclaredField("theUnsafe")
    field.setAccessible(true)
    field.get(null).asInstanceOf[sun.misc.Unsafe]
  }

  /**
   * Monkeypatches any given static field.
   *
   * @param field the field to change
   * @param newObject the new object to place in the field.
   */
  def monkeyPatchField(field: Field, newObject: AnyRef) {
    val base = unsafe.staticFieldBase(field)
    val offset = unsafe.staticFieldOffset(field)
    unsafe.putObject(base, offset, newObject)
  }
}

So, we’ve got the means to swap out any given field. Now we need to find all the classes that have a Debug field defined. This is a little trickier, as the class loader only knows about the fields in classes that have already been loaded. We need to go through the JAR file, load all the classes in that JAR into memory, and then quiz them about whether they have that field or not.

So, we add a class finder trait:

trait ClassFinder {
  def logger: org.slf4j.Logger
  def initialResource: String
  def isValidClass(className: String): Boolean

  def findClasses: Set[Class[_]] = {
    logger.debug(s"findClasses: using initialResource = ${initialResource}")

    val classSet = scala.collection.mutable.Set[Class[_]]()
    val classLoader: ClassLoader = Thread.currentThread.getContextClassLoader

    val urlToSource: URL = this.getClass.getResource(initialResource)
    logger.debug(s"findClasses: urlToSource = ${urlToSource}")

    val parts: Array[String] = urlToSource.toString.split("!")
    val jarURLString: String = parts(0).replace("jar:", "")

    logger.debug(s"findClasses: Loading from ${jarURLString}")

    val jar: URL = new URL(jarURLString)
    val jarConnection: URLConnection = jar.openConnection
    val jis: JarInputStream = new JarInputStream(jarConnection.getInputStream)
    try {
      var je: JarEntry = jis.getNextJarEntry
      while (je != null) {
        if (!je.isDirectory && je.getName.endsWith(".class")) {
          var className: String = je.getName.substring(0, je.getName.length - 6)
          className = className.replace('/', '.')
          if (isValidClass(className)) {
            logger.debug(s"findClasses: adding valid class ${className}")

            val c: Class[_] = classLoader.loadClass(className)
            classSet.add(c)
          }
        }
        je = jis.getNextJarEntry
      }
    } finally {
      jis.close()
    }
    classSet.toSet
  }
}

And then we’re going to set up something that will do Debug and args swapping specifically. There are a couple of wrinkles to this: we need to have AccessController give us privileged conditions, and between 1.6 and 1.7 the package name of Debug changed from com.sun.net.ssl.internal.ssl.Debug to sun.security.ssl.Debug, so we have to get the class we want at runtime as well.

import java.lang.reflect.Field
import java.security.{ AccessController, PrivilegedExceptionAction }
import scala.util.control.NonFatal

abstract class FixLoggingAction extends PrivilegedExceptionAction[Unit] with MonkeyPatcher with ClassFinder {

  def newOptions: String

  // A field is swappable if it is a static final field of the type we're targeting.
  def isValidField(field: Field, definedType: Class[_]): Boolean = {
    import java.lang.reflect.Modifier._
    val modifiers: Int = field.getModifiers
    field.getType == definedType && isStatic(modifiers) && isFinal(modifiers)
  }
}

object FixInternalDebugLogging {
  class MonkeyPatchInternalSslDebugAction(val newOptions: String) extends FixLoggingAction {

    // Scan the JAR that contains SSLContext, i.e. the JSSE classes.
    def initialResource = "/javax/net/ssl/SSLContext.class"

    def isValidClass(className: String): Boolean = {
      className.startsWith("com.sun.net.ssl.internal.ssl") || className.startsWith("sun.security.ssl")
    }

    def run() {
      // The Debug class moved packages between 1.6 and 1.7, so resolve it at runtime.
      val debugType: Class[_] = {
        val debugClassName = foldVersion(
          run16 = "com.sun.net.ssl.internal.ssl.Debug",
          runHigher = "sun.security.ssl.Debug"
        )
        Thread.currentThread().getContextClassLoader.loadClass(debugClassName)
      }

      val newDebug: AnyRef = debugType.newInstance().asInstanceOf[AnyRef]

      // Switch out all the classes with a static final Debug field
      for (debugClass <- findClasses) {
        for (debugField <- debugClass.getDeclaredFields) {
          if (isValidField(debugField, debugType)) {
            monkeyPatchField(debugField, newDebug)
          }
        }
      }

      // Switch out the Debug.args field, which holds the option string.
      val argsField = debugType.getDeclaredField("args")
      monkeyPatchField(argsField, newOptions)
    }
  }

  def apply(newOptions: String) {
    try {
      val action = new MonkeyPatchInternalSslDebugAction(newOptions)
      AccessController.doPrivileged(action)
    } catch {
      case NonFatal(e) =>
        throw new IllegalStateException("InternalDebug configuration error", e)
    }
  }
}
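The isValidField guard leans on the bit tests in java.lang.reflect.Modifier. As a quick sanity check of how those behave, here is a standalone snippet (names are mine, for illustration) probing a field we know to be public static final: java.lang.Math.PI.

```scala
import java.lang.reflect.Modifier._

object ModifierSketch {
  // Returns (isStatic, isFinal) for a declared field -- the same two
  // modifier tests isValidField applies to candidate Debug fields.
  def staticFinal(clazz: Class[_], fieldName: String): (Boolean, Boolean) = {
    val m = clazz.getDeclaredField(fieldName).getModifiers
    (isStatic(m), isFinal(m))
  }
}
```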

That’s it! Just call FixInternalDebugLogging("ssl") and you can turn on TLS debug output dynamically, at runtime. This is tremendously useful when you are in the Play console or running a couple of tests against a specific client.

This is a technique that works best on initialization code where the original codebase just needs to be tweaked a bit, but monkey patching can be applied to any accessible field. It’s obviously unsafe, but it’s effective. Evil, sure. But effective.
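For reference, the heart of a field swap like monkeyPatchField is plain reflection: make the field accessible and overwrite it. Here is a minimal, self-contained sketch against a field we control; the Holder object and its debug var are hypothetical stand-ins for something like Debug.args. (Patching a static final field additionally requires clearing the final modifier, which newer JDKs restrict, so this sketch sticks to a mutable field.)

```scala
object Holder {
  // A mutable field standing in for something like Debug.args.
  var debug: String = "off"
}

object FieldSwapSketch {
  // Reflectively overwrite a named field on a Scala object instance.
  def swap(target: AnyRef, fieldName: String, newValue: AnyRef): Unit = {
    val field = target.getClass.getDeclaredField(fieldName)
    field.setAccessible(true) // bypass the private backing field
    field.set(target, newValue)
  }
}
```

After calling FieldSwapSketch.swap(Holder, "debug", "ssl"), reading Holder.debug goes through the normal accessor but sees the swapped value.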

Mua ha ha.