Using Scalafix to Refactor Logging

Problem: you want to use Echopraxia structured logging in your Scala application, but you already have an existing body of logging statements. Solution: Get scalafix to rewrite the logging statements for you!

For Echopraxia, logging statements are based around a field builder API. Scala has string interpolation, so most of the time logging statements don't have string concatenation. Instead, most logging statements look like this:

logger.info(s"thing=$thing")

What we want is to break thing out into the field builder so it's not using an implicit toString call, and can be seen as a unique field in JSON:

logger.info("thing={}", fb => fb.value("thing", thing))

or, for multiple arguments:

logger.info("some field {} another field {}", fb => fb.list(
  fb.keyValue("text", text)
  fb.keyValue("number", number)
))

And we want to be able to recover from the case where exceptions are swallowed:

logger.error(s"exception=$e") // very bad, will swallow stack trace

and render it appropriately, but only for exceptions using fb.exception instead of fb.value:

logger.error(s"exception={}", _.exception(e)) // will render exception and stack trace!

So, this is not a complex refactoring, but it is more complex than IntelliJ IDEA can do out of the box. This is where Scalafix comes in. Scalafix is a refactoring and linting tool that understands how Scala code is structured semantically, using SemanticDB. The SemanticDB support exposes the abstract syntax tree in a Scala program so that it can be recognized and manipulated generally. Before, the AST was available in Scala macros, but not available outside of compilation – you could use Scala macros to autogenerate code, but you couldn't use it to rewrite existing code. As a result of integrating SemanticDB, Scalafix is capable of managing semantic rules like adding type annotations for explicit result types.

I've been interested in Scalafix for a while, but mostly as an end user, and I hadn't thought about writing Scalafix rules myself. After going through it, I recommend everyone learn how to write Scalafix rules, because they can save you so much time and boilerplate, and are really pretty easy to write.

So how does a Scalafix semantic rule work?

The short version is that Scalafix has an input, and an output. The input is a SemanticDocument that contains a tree made up of all the stuff that makes up a program. And for the output, there's a Patch class that returns… strings.

Seriously, that's all there is. You can remove tokens, but if you're patching things to the program, you're adding chunks of text. Initially, I thought that this was very limited, especially after being exposed to the Scala 3 macro program as data model, but for refactoring it removes a number of headaches.

I recommend going through the tutorial, but here I'll walk through how I built up the EchopraxiaRewriteToStructured rule, starting from scratch. The complete source code is here.

The first thing that needs doing is finding the logger statement. The basic unit of Scalafix is pattern matching, so we can start by printing out some likely programs and seeing what tree nodes look likely.

There's a "web site" way called AST Explorer which lets you paste programs in, but I prefer printing it out inline as I'm refining the pattern matching, using foo.structure (you can also reverse it with foo.syntax):

class EchopraxiaRewriteToStructured extends SemanticRule("EchopraxiaRewriteToStructured") {
  override def fix(implicit doc: SemanticDocument): Patch = {
    doc.tree.collect {
      case el =>
        println("${el.structure}") // prints out structure of tree node
        Patch.empty
    }.asPatch
  }
}

From this, we can determine that the statement logger.debug(s"foo") is represented as:

Apply(
  fun = Select(qual = Name("logger"), name = Name("info")),
  args = List(Interpolate(name = "s", parts = List("foo")))
)

This gets at the name, but we also want to check that we're not just latching on to anything called logger – it also has to be of type com.tersesystems.echopraxia.plusscala.Logger.

To do this, we have to get the qualifier's symbol information out, and then pattern match on the signature. We can do this in Scalafix by calling qual.symbol to get the symbol out, and then pulling the SymbolInformation to get at the signature. Once we have the signature, we can use SymbolMatcher to check the Logger symbol against the TypeRef.

Long story short, it looks like this:

class EchopraxiaRewriteToStructured extends SemanticRule("EchopraxiaRewriteToStructured") {
  val loggerClass = "com.tersesystems.echopraxia.plusscala.Logger"

  override def fix(implicit doc: SemanticDocument): Patch = {
    doc.tree.collect {
      case logger @ Term.Apply(
            Term.Select(loggerName, methodName),
            List(Term.Interpolate(Term.Name("s"), parts, args))
          ) if matchesType(loggerName) =>
        Patch.empty
    }.asPatch
  }

  private def matchesType(
      qual: Term
  )(implicit doc: SemanticDocument): Boolean = {
    val loggerSymbolMatcher = SymbolMatcher.normalized(loggerClass)
    val info: SymbolInformation = qual.symbol.info.get
    info.signature match {
      case MethodSignature(_, _, TypeRef(_, symbol, _)) =>
        loggerSymbolMatcher.matches(symbol)
      case other =>
        false
    }
  }
}

Now that we have a relevant logging statement, it's time to rewrite it. We can do this using Patch.replaceTree, which will replace the args inside the Apply node.

Patch.replaceTree(logger, rewrite(loggerName, methodName, parts, args))

Rewriting the code is… a string. The parts are always Lit.String, so calling lit.value.toString and sticking "{}" in between is the easiest way to parameterize them. Then, it's time to serve up the rewritten logging statement as s"""$loggerTerm.$methodTerm("$template", fb => $body)""", and account for some edge cases:

class EchopraxiaRewriteToStructured extends SemanticRule("EchopraxiaRewriteToStructured") {
  // ...
  private def rewrite(
      loggerTerm: Term,
      methodTerm: Term,
      parts: List[Lit],
      args: List[Term]
  )(implicit doc: SemanticDocument): String = {
    if (args.isEmpty) {
      val template = parts.map(_.value.toString).mkString("{}")
      s"""$loggerTerm.$methodTerm("$template")"""
    } else {
      val template = parts.map(_.value.toString).mkString("{}")
      val values = args.map {
        case arg: Term.Name =>
          if (isThrowable(arg.symbol.info.get.signature)) {
            s"""fb.exception($arg)"""
          } else {
            s"""fb.$fieldBuilderMethod("$arg", $arg)"""
          }
        case other =>
          // XXX I don't think this is possible?
          s"""fb.$fieldBuilderMethod("$other", $other)"""
      }
      val body =
        if (values.size == 1) values.head
        else s"""fb.list(${values.mkString(", ")})"""
      s"""$loggerTerm.$methodTerm("$template", fb => $body)"""
    }
  }
}

So hang on a sec… how do we know an argument is a throwable?

This is where it gets really interesting, because this is where we start running into the limits of Scalafix. Scalafix can look at the structure of a type, but does not expose the subtyping information of a type. This is a problem, because exceptions rely heavily on subtyping to work.

However, there is a hack that we can try. From poking at the issues, we can try java runtime reflection to instantiate the class, and see if it's assignable from Throwable. I don't love the manual hacking on the symbol to kludge it into a fully qualified class name, but it'll work.

class EchopraxiaRewriteToStructured extends SemanticRule("EchopraxiaRewriteToStructured") {
  // ...
  def isThrowable(signature: Signature): Boolean = {
    def toFqn(symbol: Symbol): String = symbol.value
      .replaceAll("/", ".")
      .replaceAll("\\.$", "\\$")
      .stripSuffix("#")
      .stripPrefix("_root_.")

    signature match {
      case ValueSignature(TypeRef(_, symbol, _)) =>
        val cl = this.getClass.getClassLoader
        try {
          classOf[Throwable].isAssignableFrom(cl.loadClass(toFqn(symbol)))
        } catch {
          case e: Exception =>
            false
        }
      case _ =>
        false
    }
  }
}

Finally, let's add some configuration so that we can account for custom loggers and custom field builder methods. This is very simple: plop down a Config and a withConfiguration method and we're pretty much done:

import metaconfig.{ConfDecoder, Configured}
import metaconfig.generic.Surface

// ...

class EchopraxiaRewriteToStructured(
    config: EchopraxiaRewriteToStructured.Config
) extends SemanticRule("EchopraxiaRewriteToStructured") {

  private val loggerClass: String = config.loggerClass
  private val fieldBuilderMethod: String = config.fieldBuilderMethod

  def this() = this(EchopraxiaRewriteToStructured.Config())

  override def withConfiguration(config: Configuration): Configured[Rule] =
    config.conf
      .getOrElse("EchopraxiaRewriteToStructured")(this.config)
      .map { newConfig => new EchopraxiaRewriteToStructured(newConfig) }

  // ...
}

object EchopraxiaRewriteToStructured {
  case class Config(
      loggerClass: String = "com.tersesystems.echopraxia.plusscala.Logger",
      fieldBuilderMethod: String = "value"
  )

  object Config {
    val default = Config()

    implicit val surface: Surface[Config] =
      metaconfig.generic.deriveSurface[Config]
    implicit val decoder: ConfDecoder[Config] =
      metaconfig.generic.deriveDecoder(default)
  }
}

And that's it! I hope this shows how simple and straightforward refactoring in Scalafix can be. For extra credit, I also have EchopraxiaWrapMethodWithLogger that will wrap a method in a flow or trace logger.