Terse Systems

Fixing the Most Dangerous Code in the World

| Comments

TL;DR

Most non-browser HTTP clients do SSL / TLS wrong. Part of why clients do TLS wrong is because crypto libraries have unintuitive APIs. In this post, I’m going to write about my experience extending an HTTP client to configure Java’s Secure Socket library correctly, and what to look for when implementing your own client.

Introduction

I volunteered to implement a configurable TLS solution for Play’s web services client (aka WS). WS is a Scala-based wrapper on top of AsyncHttpClient that provides asynchronous mechanisms like Future and Iteratee, and allows a developer to make GET and POST calls to a web service in just a couple of lines of code.

However, WS did not contain any way to configure TLS. It was technically possible to configure TLS through the use of system properties (i.e. “javax.net.ssl.keyStore”) but that brought up more messiness — what if you wanted more than one keystore? What if you needed clients with different ciphers? Sadly, WS isn’t alone in this: most frameworks don’t provide configuration for the finer points of TLS.

Added to that was the awareness that SSL client libraries have been dubbed The Most Dangerous Code in the World (FAQ). The problem is real and serious, and I want to fix it.

There is also a long and well known gulf between the security community and the developer community about the level of knowledge about TLS and the current state of the HTTPS certificate ecosystem. I want to fix that as well, and this blog post should be a good start.

So. Here’s what I did.

Table of Contents

For the sake of readability (i.e. avoiding TL;DR), I’m breaking this across several blog posts.

The pull request is on Github and you are invited to review the code and comment as you see fit.

In this blog post, I’m just going to cover the setup.

First, the problems that make TLS necessary.

  • The First Problem: Programmers do not get security
  • The Second Problem: Wifi / Ethernet is not secure
  • The Third Problem: Man In the Middle
  • Digression: Mitigation and General Security
  • More Videos and Talks

Then, the implementation in WS.

  • The Use Cases for WS
  • Understanding TLS
  • Understanding JSSE
  • Configuring a client
  • Debugging a client
  • Choosing a protocol
  • Choosing a cipher suite

Future posts will discuss certificates in more detail, but this gives us somewhere to start.

Problems

The First Problem: Programmers do not get security

The first problem is the assumption that TLS is overkill, built by researchers to protect against an abstract threat.

Unfortunately, this is not the case. TLS has real attacks against it, and they exist in the wild. Even worse, there are very serious, real world implications to breaking a TLS connection. Some people trust TLS in situations which could mean imprisonment or death.

But. Programmers work with bugs. Programmers get bugs. Programmers do not get security.

Programmers understand how bad input can ruin a programmer’s day. Programmers understand how corrupt data can completely ruin any hope of a functioning program. Programmers know that working with concurrency is so dangerous that it should only be done with special concurrency primitives and rules. Human users may be incompetent, but they are mostly benevolent: the forces working against the programmer are entropy and loose requirements.

Programmers don’t usually write programs that have to defend against an attacker. Most programmers have never even seen an attacker. Even the concept of a human deliberately trying to break or subvert a program is foreign. QA usually tests for successful cases, and maybe for some negative test cases… but QA typically doesn’t submit specially crafted XML documents that poke at the filesystem or chew up gigabytes of memory with character entities.

If programming is like driving a car, then the difference between working with QA versus working against an attacker is the difference between driving in rush hour versus driving with someone determined to run you off the road. TLS, and people using TLS based clients, have to assume that someone is going to try to run them off the road.

It also helps to see what an attack is like. For most programmers, an attack is theoretical, even a joke in poor taste. It doesn’t become real until an actual attack is demonstrated by a security researcher in front of the programmer in question.

With that in mind, I’ve included several videos from real live security professionals in this blog post. You don’t have to watch them all at once, but you should watch them all eventually and see how they think.

The Second Problem: Wifi / Ethernet is not secure

TLS has a specific problem that most programmers do not have to deal with. TLS has to assume an attacker has access to the TCP/IP stream between client and server. This is commonly called packet snooping.

It is trivial to snoop on other computers in your network using tools like Wireshark. This is doubly true when using wireless networks, such as a coffee shop. People are often surprised that every single website they visit is being broadcast in the clear, in the same way as radio, but it’s true, and it’s easy to pick up those radio transmissions.

But don’t take my word for it. Here’s the Wifi Pineapple:

You can buy a Wifi Pineapple for $100 plus shipping.

It picks up all traffic sent over a wifi network. It’s so good at intercepting traffic that people have turned it on and started intercepting traffic accidentally.

You can plug it into an ethernet adapter, or there’s an ultra bundle that provides a huge antenna and an elite version that lets it run off a battery for 72 hours.

Don’t think that WPA protects you from this. WPA2 is vulnerable to brute-forcing, and most people choose passwords with extremely low entropy, partly because wifi passwords are shared so often.

The Third Problem: Man In the Middle

Not only can an attacker sniff packets on the network, but the attacker can also substitute traffic. Here’s a video of Cain & Abel at work:

Note that it takes less than 20 seconds to impersonate the server, after which the attacker can modify any URL coming from the server to point somewhere else. This is why rendering a login page in HTTP is essentially no protection at all: by the time the page is rendered, the attacker can make the HTML form send to a completely different URL.

This attack is called Man in the Middle. (For an in-depth demonstration of MITM attacks, check out the presentation Hey, I just middled you, and this is crazy.)

This is not a theoretical attack. It has been automated to the point where rewriting web pages on the fly is fairly trivial — if you go down to your local hackerspace and browse the Internet without using a VPN, at some point you will find all your image URLs are pointing to Goatse.

Google has been subject to a host of attacks from Iran. Note that the attack happened at the backbone, not at a particular coffee shop. Advanced persistent threats (APT) can include large nation states as well as script kiddies.

Encrypting data with public key encryption ensures that an attacker cannot read plaintext over the network. However, the client still doesn’t know the identity of the server it’s trying to connect to. This is a problem: if you don’t verify the identity of the machine you’re talking to, then you could be talking to anyone.

Digression: Mitigation and General Security

Since first writing this, I’ve had some people for whom this has been their first exposure to just how insecure the Internet really is. So, before I head into some serious technical nerdery, I’d like to point out some good general resources for end users.

My Shadow shows the digital profiles that are exposed when you use the Internet, and shows how an attacker can leverage that information in aggregate. Often, rather than breaking encryption, the attacker will collect online facts about you, phone up tech support, and attempt to convince them to change your password.

Tactical Tech has a list of programs which are known to have reasonably good security.

Security in a Box discusses the operational security practices (also known as OPSEC) needed to use these tools effectively.

ONO Robot is a series of videos detailing how to use websites safely and securely (choosing good passwords, limiting cookies, etc).

Finally, Eva Galperin of the EFF gave a talk about guides to security in general, and what can be at stake for people who rely on these guides:

Other Videos

If this isn’t enough for you, then clicking through the following videos ought to change your mind.

The Sorry State of SSL by Hynek Schlawack:

SSL/TLS Interception Proxies and Transitive Trust by Jeff Jarmoc:

Cryptography is a Systems Problem by Matthew Green:

The Use Cases

So that’s the threat model. Now the use cases for WS.

WS is a web services client, intended for asynchronous, non-blocking programmatic access to services using HTTP. Most clients will be RESTful with either a small (4k) XML or JSON payload or continuously streaming data. Clients will only connect to a few well-known servers. Use of WS for general browsing or indexing a website is possible, but not the focus.

Client connects to internal WS service

In this use case, the client is talking to a service which is not publicly available. The client and server will use private certificates (the “moxie option”), use a PKI management solution like DigiCert or OpenCA, or use an internal root CA.

The client may use mTLS / client authentication to connect to the internal service as an additional security measure.

The server will most likely support TLSv1.0. TLSv1.2 support is unlikely, given that it does not come out of the box with nginx and other servers. It is likely that the server supports RC4 ciphers.

Client connects to external WS service

In this use case, the client is talking to an external WS service, which is not owned by the organization and exists on the public Internet. The server may have a self-signed certificate, but is more likely to have a public certificate signed by a certificate authority.

Public facing “webscale” services using HTTPS are likely to support TLSv1.2 and support good ECC ciphers.

Client connects to public internet

In this use case, the client is calling up random URLs given to it by the web service and storing the content. This is a behavior of RSS feed web applications, which require connections to scrape and process data but do not typically analyse the contents.

The server could be anything. This is not the primary use case, and so the defaults will not be tuned for maximum compatibility with unknown or misconfigured servers.

Understanding TLS

This is going to be extremely abbreviated, but let’s give a refresher anyway. Adapted from Zytrax’s SSL Survival Guide (which also has some excellent sections on X.509 certificates):

TLS has four components: authentication, message integrity, key negotiation and encryption.

The client and the server begin a handshake.

The server sends a certificate and the chain of certificates leading back to a root certificate authority. The client should perform certificate and chain validation, making sure the chain terminates at a root CA trusted by the client.

In mutual TLS or client authentication, the client also sends a certificate to the server. This is rare, and most communication just authenticates the server to the client.

Because the server hands out the public key certificate and keeps the private key, the client can encrypt all HTTP information using the public key, and the server can decrypt it using the private key.

Thomas Pornin’s explanation of how SSL works is also excellent. For details, Wikipedia is comprehensive as always.

So far, so good. Next is JSSE.

Understanding JSSE

JSSE is complex. The reference guide and the crypto spec are surprisingly helpful (once I started to understand them), but it wasn’t until I had the source code handy and could look at the internal Sun JSSE classes that I felt I had a handle on it.

In addition, a number of best-practices guides were very helpful.

It’s important to note that Play supports JDK 1.6, and 1.6 came out in December 2006. That’s over seven years ago. Since then, TLS (and the attacks on TLS) have evolved. Where possible, I wanted to bring 1.6 up to the 1.7 level of functionality, or at least note where it lags.

There are several parts to JSSE, but it all starts with SSLContext. Without a correctly configured SSLContext, you have nothing.

val sslContext = SSLContext.getInstance(protocol)
sslContext.init(keyManagers, trustManagers, secureRandom)
sslContext // a correctly configured SSL context

Wait, what? What’s a TrustManager? What’s a KeyManager? Well, from the JSSE Reference Guide (which is the first, last and frequently only word on the subject):

  • TrustManager: Determines whether the remote authentication credentials (and thus the connection) should be trusted.
  • KeyManager: Determines which authentication credentials to send to the remote host.

Most of the time, you’ll be working with a trust manager. You only need to worry about a key manager if you’re doing client authentication.

The interesting thing with this API, right off the bat, is that init() takes null parameters for defaults, and it takes an array of managers.

sslContext.init(null, null, null)

What the JSSE Reference Guide says is “installed security providers will be searched for the highest priority implementation of the appropriate factory”. What actually happens is that you get an empty key manager and a default X509TrustManagerImpl that points to cacerts.
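To see what that default actually looks like, you can build the same trust manager yourself. This is a small sketch using only standard JSSE calls (nothing here is Play-specific):

```scala
import java.security.KeyStore
import javax.net.ssl.{TrustManagerFactory, X509TrustManager}

// init(null, null, null) falls back to the default X.509 trust manager,
// which is built from the cacerts store. The same manager can be
// obtained explicitly and inspected:
val tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm)
tmf.init(null.asInstanceOf[KeyStore]) // null keystore = use the default cacerts store
val defaultTm = tmf.getTrustManagers.collectFirst {
  case x509: X509TrustManager => x509
}.get
println(s"trusted root CAs: ${defaultTm.getAcceptedIssuers.length}")
```

Printing `getAcceptedIssuers` is a quick way to confirm which root CAs your client will actually trust.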

Likewise, the API takes an array of key managers, so you would expect this to work:

sslContext.init(Array(keyManager1, keyManager2), null, null) // what could go wrong?

The problem here is that init() doesn’t compose or aggregate managers together. As the Javadoc says, “only the first instance of a particular key and/or trust manager implementation type in the array is used. (For example, only the first javax.net.ssl.X509KeyManager in the array will be used.)”

There is similar fineprint and tricksy assumptions throughout the JSSE API. If you don’t have the source code available, you will be utterly confused.

If you’re not using SSLContext, then changes are done by setting system properties. This isn’t a bad way per se, but it’s global and opaque to the API.

Direct unit testing is painful as half the classes are defined as final, or use static methods. The internal logic is frustratingly and needlessly tightly coupled.

Despite all of this, and despite having interfaces temptingly near, you should NEVER replace the underlying JSSE functionality. Augment it, sure. Subclass away. Filter out weak points. But doing a straight up rewrite is a mistake. As per Moxie Marlinspike:

If you’re interested in writing a more restrictive TrustManager implementation for Android, my recommendation is to have your implementation call through to the system’s default TrustManager implementation as the very first thing it does. That way you can ensure you at least won’t be doing any worse than the default, even if there are vulnerabilities in the additional checks you do.

Guardian’s StrongTrustManager Vulnerabilities
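Moxie’s advice can be sketched as a delegating trust manager. The class name and structure here are illustrative (this is not the Guardian’s or Play’s actual code): the default checks always run first, and any more restrictive checks are layered on top.

```scala
import java.security.cert.X509Certificate
import javax.net.ssl.X509TrustManager

// Hypothetical example: wrap the system default trust manager so that the
// default validation always runs before any additional checks.
class DelegatingTrustManager(underlying: X509TrustManager) extends X509TrustManager {
  def checkServerTrusted(chain: Array[X509Certificate], authType: String): Unit = {
    underlying.checkServerTrusted(chain, authType) // never do worse than the default
    // ...additional, more restrictive checks would go here...
  }
  def checkClientTrusted(chain: Array[X509Certificate], authType: String): Unit =
    underlying.checkClientTrusted(chain, authType)
  def getAcceptedIssuers: Array[X509Certificate] = underlying.getAcceptedIssuers
}
```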

Having read through TLS and JSSE, we’re now ready to check out how to configure the client.

Configuring a client

So, with the source code in hand, the first question was: how bad is WS?

It turns out that WS does the right thing. If you want to disable certificate validation, you have to explicitly set the following in application.conf:

ws.acceptAnyCertificate = true

which will let you accept a self-signed certificate that has not been added to your trust store.

However, the way ws.acceptAnyCertificate works is interesting. In Play 2.2.x, it looks like this:

if (!playConfig.flatMap(_.getBoolean("ws.acceptAnyCertificate")).getOrElse(false)) {
  asyncHttpConfig.setSSLContext(SSLContext.getDefault)
}

That’s it. There’s no other logic that involves telling AsyncHttpClient to accept any certificate anywhere else in Play.

It turns out that accepting any certificate is the default behavior in AsyncHttpClient. If you are making HTTPS calls in Java using AsyncHttpClient 1.7.x directly, you are vulnerable to a MITM attack.

The SSLContext class is central to the SSL implementation in Java in general and in AsyncHttpClient in particular. The default SSLContext for AsyncHttpClient is dependent on whether the javax.net.ssl.keyStore system property is set. If this property is set, AsyncHttpClient will create a TLS SSLContext with a KeyManager based on the specified key store (and configured based on the values of many other javax.net.ssl properties as described in the JSSE Reference Guide linked above). Otherwise, it will create a TLS SSLContext with no KeyManager and a TrustManager which accepts everything. In effect, if javax.net.ssl.keyStore is unspecified, any ol’ SSL certificate will do.

SSL Certificate Verification in Dispatch and AsyncHttpClient

The first step in implementing HTTPS is to set up certificate verification to avoid issue 352. This, in itself, is fairly easy: just create an SSLContext instance, then init with null values.

val builder = new AsyncHttpClientConfig.Builder()
val sslContext = SSLContext.getInstance(protocol)
sslContext.init(null, null, null)
builder.setSSLContext(sslContext)
val asyncHttpClientConfig = builder.build()

But, of course, that was only the beginning.

The essential problem with ws.acceptAnyCertificate is that while it’s wrong, it’s also a one-line configuration setting. It’s obvious what it does. Meanwhile, the experience of adding a self-signed certificate to the trust manager is downright painful. By default, the root CA certificates are in $JAVA_HOME/jre/lib/security/cacerts, and so if you want to add an extra certificate (rather than replace all the existing CA certs), you have to know the exact keytool command for it:

keytool -import -trustcacerts -file /path/to/ca/ca.pem -alias CA_ALIAS -keystore $JAVA_HOME/jre/lib/security/cacerts

Just from a deployment and maintenance perspective, this is a huge hassle. And it’s not like trust stores and keystores are all that complicated: there’s a list of certificates tied to aliases, with an optional password attached.
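As a quick illustration that a store really is just aliased certificates, you can open cacerts itself and list what’s inside. The path below assumes a standard JDK layout; cacerts can be loaded with a null password, which simply skips the integrity check.

```scala
import java.io.FileInputStream
import java.security.KeyStore
import java.util.Collections

// Open the JDK's own trust store and list its aliases.
// java.home points at the jre directory on 1.6/1.7/1.8, so this path
// works without the extra "jre/" segment.
val cacertsPath = s"${System.getProperty("java.home")}/lib/security/cacerts"
val ks = KeyStore.getInstance(KeyStore.getDefaultType)
val in = new FileInputStream(cacertsPath)
try ks.load(in, null) finally in.close() // null password = skip integrity check
val aliases = Collections.list(ks.aliases)
println(s"${aliases.size} certificates, e.g. ${aliases.get(0)}")
```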

The simplest thing to do, from a programmer perspective, would be to have a list of stores that were pulled into a single manager. Then, instead of having to run a keytool command, you could just add a line saying where your store was, and you’d be done.

This involved creating a key manager and a trust manager that could take multiple stores. After looking through the source code, I determined that there was no API problem with using multiple stores inside a single manager… but then ran into the implementation again. In the X.509 implementation of JSSE, there’s a one to one correspondence between a manager and a store. I ended up creating managers from the factories and then using a composite manager pattern based off Cody A. Ray’s blog post, and using the X509TrustManagerImpl implementation and the X509ExtendedTrustManager example as references.

The composite trust manager has a list of X509TrustManagerImpl, and iterates through each one until it finds one that doesn’t throw an exception. If all of them throw exceptions, then it rethrows the exception with the entire list (so that no exceptions are swallowed), otherwise, it returns the first good result. This extends the TrustManager functionality while safely keeping all of the existing logic in place.

def checkServerTrusted(chain: Array[X509Certificate], authType: String): Unit = {
  var trusted = false
  val exceptionList = withTrustManagers {
    trustManager =>
      trustManager.checkServerTrusted(chain, authType)
      trusted = true
  }

  if (!trusted) {
    val msg = s"No trust manager was able to validate this certificate chain: # of exceptions = ${exceptionList.size}"
    throw new CompositeCertificateException(msg, exceptionList.toArray)
  }
}

private def withTrustManagers(block: (X509TrustManager => Unit)): Seq[Throwable] = {
  val exceptionList = ArrayBuffer[Throwable]()
  trustManagers.foreach {
    trustManager =>
      try {
        block(trustManager)
      } catch {
        case e: CertPathBuilderException =>
          logger.debug("No path found to certificate: this usually means the CA is not in the trust store", e)
          exceptionList.append(e)
        case e: GeneralSecurityException =>
          logger.debug("General security exception", e)
          exceptionList.append(e)
        case NonFatal(e) =>
          logger.debug("Unexpected exception!", e)
          exceptionList.append(e)
      }
  }
  exceptionList
}

Now you can configure multiple key stores and trust stores directly in application.conf:

ws.ssl {
  keyManager = {
    stores = [
      { type: "PKCS12", path: "keys/client.p12", password: "changeit2" },
      { type: "PEM", path: "keys/client.pem" }
    ]
  }

  trustManager = {
    stores = [
      { path: "keys/mystore.jks" },
      { path: ${java.home}/lib/security/cacerts }
    ]
  }
}

and end up with a properly configured key manager and trust manager that contain all the keys from the various stores.

There’s a lot more to key stores and trust stores than I’ve mentioned here. For more details (including how to resolve improperly configured certificate chains), see:

To muck with certificates inside a keystore, I recommend Keystore Explorer or java-keyutil.

Configuring multiple clients

So now we have a configuration. But there’s another problem. There’s only one application.conf file, and all the WS methods are on the companion object:

WS.url("https://google.com")

This meant that if you had several web services, say “secure.com” and “loose.com”, you could not set up different configuration profiles for them, set up a client dynamically, or do isolated testing. Everything had to be handled when the Play configuration loaded.

I broke apart the WS.client and added a WSClient trait that could call url in the same way. Now you can do this:

import com.typesafe.config.ConfigFactory
import play.api.libs.ws._
import play.api.libs.ws.ning._

val configuration = play.api.Configuration(ConfigFactory.parseString(
  """
    |ws.ssl.trustManager = ...
  """.stripMargin))
val parser = new DefaultWSConfigParser(configuration)
val builder = new NingAsyncHttpClientConfigBuilder(parser.parse())
val secureClient : WSClient = new NingWSClient(builder.build())
val response = secureClient.url("https://secure.com").get()

and have much finer grained control over the TLS configuration.

Unfortunately, getting a client passed as an implicit parameter to WS.url is harder. I added a magnet pattern so you can do this:

object PairMagnet {
  implicit def fromPair(pair: Pair[WSClient, java.net.URL]) =
    new WSRequestHolderMagnet {
      def apply(): WSRequestHolder = {
        val (client, netUrl) = pair
        client.url(netUrl.toString)
      }
   }
}
import scala.language.implicitConversions
val client = WS.client
val exampleURL = new java.net.URL("http://example.com")
WS.url(client -> exampleURL).get()

and added another method WS.clientUrl that takes an implicit client:

implicit val sslClient = new play.api.libs.ws.ning.NingWSClient(sslBuilder.build())
WS.clientUrl("http://example.com/feed")

(Since this was first written, JDK 1.8 came out and you can now specify multiple key stores in a file using JEP-166.)

Debugging a client

While I was going through the client, I figured I may as well make it easier to turn on and off debugging as well.

Debugging is done by setting a system property, i.e. -Djavax.net.debug="ssl". Debugging output is written directly to System.out via println(), and the only way to redirect it is to replace System.out. I can only hope this changes in JDK 1.8 — at the very least it should use java.util.logging — but it’s what there is for now.
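Because JSSE reads the property when it initializes, it has to be in place before the first TLS connection. The command line is safest, but setting it first thing in a main method also works. A small sketch, with a hypothetical entry point:

```scala
// Hypothetical entry point: set javax.net.debug before any TLS code runs,
// since JSSE reads it during initialization.
object DebugMain {
  def main(args: Array[String]): Unit = {
    System.setProperty("javax.net.debug", "ssl:handshake:verbose")
    // ...only make TLS connections after this point...
  }
}
```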

I had the JSSE debug page and the debug section of the reference guide handy, so it was fairly simple to provide that in configuration rather than futz with system properties. I added certpath and “ocsp” (an undocumented debug property) as well, while I was checking for certificate validation.

ws.ssl.debug = [
 "certpath", "ocsp",

 # "all"  # defines all of the below
 "ssl",
 "defaultctx",
 "handshake",
   "verbose",
   "data",
 "keygen",
 "keymanager",
 "pluggability",
 "record",
   "packet",
   "plaintext",
 "session",
 "sessioncache",
 "sslctx",
 "trustmanager"
]

This is not a perfect solution, because system properties are global across all clients. But it’s better.

ADDENDUM: this only worked intermittently and eventually I figured out why and fixed it. The more sensitive among you may wish to avoid this link.

Next, it was time to figure out what went into the client. The most important thing is the protocol.

Choosing a protocol

TLS comes in different versions. In JSSE, the list is available here.

There are two calls that refer directly to the protocol names, the getInstance call:

val sslContext = SSLContext.getInstance("TLS") // or "TLSv1.2"

and the enabledProtocols list, which shows what the SSL context is willing to accept:

val enabledProtocols = sslContext.getDefaultSSLParameters().getProtocols()

SSLv2 and SSLv2Hello (there is no v1) are obsolete, and SSLv2 usage in the field is down to 25% on the public Internet. SSLv3 is known to have security issues but is still ubiquitous: virtually all HTTPS servers support it, and Mozilla Firefox still enables SSLv3 by default. All of the SSL protocols have a number of security issues compared to TLS.

TLSv1.2 is the current version, but early implementations of TLS 1.2 were prone to misconfiguration, which resulted in TLS 1.2 being disabled for the client by default in 1.7. Mozilla Firefox also has TLSv1.2 disabled, as of January 2014, and is only enabling it in the next version.

However, virtually all servers support TLSv1.0, and given our use cases, we expect that web services will have TLSv1.2 configured correctly. TLS 1.0 has been described as broken because of the BEAST attack, but the attack applies only to CBC ciphers, which we’re not obliged to use.

We want people to use the highest possible version of TLS. So we specify “TLSv1.2”, “TLSv1.1”, “TLSv1” in that order for JDK 1.7. For JDK 1.6, only “TLSv1” is available, so that’s what we have. We throw an exception on “SSLv3”, “SSLv2” and “SSLv2Hello”. If you don’t want that, then you have to explicitly set a ws.ssl.loose.allowWeakProtocols flag.
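The deprecation logic described above can be sketched like this (the names `checkProtocols` and `allowWeakProtocols` are illustrative, not the actual Play internals):

```scala
// Refuse to run with SSL-era protocols unless a loose flag is explicitly set.
val weakProtocols = Set("SSLv3", "SSLv2", "SSLv2Hello")

def checkProtocols(enabled: Seq[String], allowWeakProtocols: Boolean): Seq[String] = {
  val weak = enabled.filter(weakProtocols.contains)
  if (weak.nonEmpty && !allowWeakProtocols)
    throw new IllegalStateException(s"Weak protocols enabled: ${weak.mkString(", ")}")
  enabled
}

// JDK 1.7 ordering: prefer the highest version first.
val ok = checkProtocols(Seq("TLSv1.2", "TLSv1.1", "TLSv1"), allowWeakProtocols = false)
```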

You can also specify the default protocol and the protocols list explicitly, i.e. if you want to configure JSSE for Suite B:

ws.ssl.protocol = "TLSv1.2" // passed into SSLContext.getInstance()

// used to set the enabled protocols (e.g. SSLEngine.setEnabledProtocols())
ws.ssl.enabledProtocols = [
  "TLSv1.2"
]

(Since this was written, JDK 1.8 came out and you can now set the system property “jdk.tls.client.protocols” to enable protocols.)

Next, it’s time to figure out what cipher suite to pick.

Choosing a cipher suite

A cipher suite is really four different ciphers in one, describing the key exchange, bulk encryption, message authentication and random number function. In this particular case, we’re focusing on the bulk encryption cipher.
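As a toy illustration, the four parts can be read straight out of an IETF-style suite name (the splitting logic here is purely for illustration):

```scala
// TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 breaks down as:
//   key exchange ECDHE, authentication RSA,
//   bulk encryption AES_128_GCM, MAC/PRF hash SHA256.
val suite = "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
val parts = suite.stripPrefix("TLS_").split("_WITH_")
val keyExchangeAndAuth = parts(0) // "ECDHE_RSA"
val bulkAndMac = parts(1)         // "AES_128_GCM_SHA256"
println(s"kx/auth = $keyExchangeAndAuth, bulk+mac = $bulkAndMac")
```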

Recommended Cipher Suites

The JSSE list of cipher suites is here and there is an extensive comparison list. There’s a number of different ciphers available, and the list has changed substantially between JDK 1.7 and JDK 1.6.

In 1.8, the cipher list is ideal.

In 1.7, the default cipher list is reportedly pretty good.

In 1.6, the default list is out of order — some of the weaker ciphers show up before the stronger ciphers do. Not only that, but 1.6 has no support for Elliptic Curve cryptography (ECC) ciphers, which are much stronger and allow for perfect forward secrecy.

Now, the client doesn’t control what cipher will eventually be used. The server does. As a client, there are two things that you can do:

  1. You can present a list of ciphers which you are willing to accept.
  2. You can refuse a cipher which you know to be weak.
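Point 1 can be sketched with plain JSSE calls: intersect a preferred list with what the JVM actually supports, preserving the preference order, and enable only that. The preferred list below is just an example.

```scala
import javax.net.ssl.SSLContext

// A client can only propose suites; the server picks. Intersect our
// preference list with what this JVM supports, keeping our order.
val sslContext = SSLContext.getDefault
val supported = sslContext.getSupportedSSLParameters.getCipherSuites.toSet
val preferred = Seq(
  "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
  "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
  "TLS_RSA_WITH_AES_128_CBC_SHA"
)
val enabled = preferred.filter(supported.contains)
val engine = sslContext.createSSLEngine()
engine.setEnabledCipherSuites(enabled.toArray)
println(s"enabled: ${enabled.mkString(", ")}")
```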

In 1.7, we use the default cipher list.

For 1.6, the client provides a truncated cipher list based off Brian Smith’s list, with the ECC ciphers taken out and the 3DES cipher removed. Roughly 55% of the Internet uses RC4, and given that WS is a web services client, it will probably be talking to only a few services which are current.

val java16RecommendedCiphers: Seq[String] = Seq(
  "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
  "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
  "TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
  "TLS_RSA_WITH_AES_256_CBC_SHA",
  "TLS_RSA_WITH_AES_128_CBC_SHA",
  "SSL_RSA_WITH_RC4_128_SHA",
  "SSL_RSA_WITH_RC4_128_MD5",
  "TLS_EMPTY_RENEGOTIATION_INFO_SCSV" // per RFC 5746
)

This isn’t the only possible option. There is an IETF recommended list of cipher suites:

  • TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  • TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
  • TLS_DHE_RSA_WITH_AES_128_GCM_SHA256
  • TLS_DHE_RSA_WITH_AES_256_GCM_SHA384

and suggests TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as preferred in general.

Deprecated Cipher Suites

There are some ciphers which everyone agrees are bad and should never be used: NULL, export suites, DES, and anonymous (anon) key exchange. They are disabled by default, and the JSSE team says, “You are NOT supposed to use these cipher suites.”

This brings up the next question: should WS consider RC4 and MD5 based ciphers to be weak? Surprisingly, probably not.

If you are setting up a server, you shouldn’t use RC4 and MD5 in your cipher suites, certainly. But if you’re a client, you’re probably fine talking to a server that is using RC4 or MD5.

In the case of RC4:

RC4 is horribly broken, and is horribly broken in ways that are meaningful to TLS. But the magnitude of RC4’s brokenness wasn’t appreciated until last year, and up until then, RC4 was a common recommendation for resolving both the SSL3/TLS1.0 BEAST attack and the TLS “Lucky 13” M-t-E attack. That’s because RC4 is the only widely-supported stream cipher in TLS. Moreover, RC4 was considered the most computationally efficient way to get TLS deployed, which 5-6 years ago might have been make-or-break for some TLS deployments. You should worry about RC4 in TLS —– but not that much: the attack is noisy and extremely time consuming. You should not be alarmed by MD5 in TLS, although getting rid of it is one of many good reasons to drive adoption of TLS 1.2.

Thomas H. Ptacek

and:

The best-known attack against using RC4 with HTTPS involves causing a browser to transmit many HTTP requests — each with the same cookie — and exploiting known biases in RC4 to build an increasingly precise probability distribution for each byte in a cookie. However, the attack needs to see on the order of 10 billion copies of the cookie in order to make a good guess. This involves the browser sending ~7TB of data. In ideal situations, this requires nearly three months to complete.

A roster of TLS cipher suites weaknesses

The case of MD5:

The MD5 hash function is broken, that is true. However, TLS doesn’t use MD5 in its raw form; it uses variants of HMAC-MD5, which applies the hash function twice, with two different padding constants with high Hamming distances (put differently, it tries to synthesize two distinct hash functions, MD5-IPAD and MD5-OPAD, and apply them both). Nobody would recommend HMAC-MD5 for use in a new system, but it has not been broken.

Thomas H. Ptacek

and:

The attacks on HMAC-MD5 do not seem to indicate a practical vulnerability when used as a message authentication code.

RFC 6151, section 2.3

Note that we are talking about use of MD5 in a cipher here — as a client, accepting an MD5 signed certificate is a different kettle of fish.

Disabling Deprecated Cipher Suites

The jdk.tls.disabledAlgorithms security property in 1.7 works fine to exclude bad or weak ciphers, and can also check for small key sizes in a handshake. In particular, before 1.8, ephemeral DH parameters (DHE) were limited to 1024 bits, which is considered weak these days (although apparently still inherently stronger than RSA keys). You may also need to disable DHE entirely if you’re on 1.7, as there’s a nasty bug in 1.7 that causes connections to fail 0.05% of the time. Frankly, though, upgrading to 1.8 is a much better solution: you can specify “-Djdk.tls.ephemeralDHKeySize=2048” and get both perfect forward secrecy and a decent key size.

jdk.tls.disabledAlgorithms is a security property, not a system property, and is null by default:

scala> java.security.Security.getProperty("jdk.tls.disabledAlgorithms")
res9: String = null

The class that uses jdk.tls.disabledAlgorithms is TLSDisabledAlgConstraints, held in a static final field. There is no reliable and safe way of setting the property dynamically in code — once the class has loaded, that’s what you’ve got.

private final static AlgorithmConstraints tlsDisabledAlgConstraints =
            new TLSDisabledAlgConstraints();

Instead, you must set it in a properties file:

# disabledAlgorithms.properties
jdk.tls.disabledAlgorithms=DHE, ECDH, ECDHE, RSA, DSA, EC

The parameters to use are not immediately obvious, but are listed in the Providers document and from the code itself.

Once you’re done, reference that file from the command line using the undocumented java.security.properties system property:

java -Djava.security.properties=disabledAlgorithms.properties

Note that you will only be able to use ECC algorithms if you are on an Oracle JDK.

Another option is to set up the constraints on an SSLParameters object from setAlgorithmConstraints:

val sslParameters = sslContext.getDefaultSSLParameters() // clones new instance from default
val sslEngine = sslContext.createSSLEngine(peerHost, peerPort)
sslParameters.setAlgorithmConstraints(algConstraints)
sslEngine.setSSLParameters(sslParameters)

which looks much more convenient from a configuration perspective… but there’s a problem. The disabled algorithms filter is not supported in 1.6. Play supports 1.6, so if we want this feature, we have to do something else.

We can’t check the server handshake at runtime for the cipher, but we can cheat: we can check the SSLContext’s list of enabled cipher suites. We check the cipher list at configuration time, and throw an exception if we find a weak cipher in the list. If you want to turn off the check, you have to configure the ws.ssl.loose.acceptWeakCiphers flag.

I don’t think that ciphers need to be checked at run time, as the client should not accept a cipher from the server that is not already in the client’s list.
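
Such a configuration-time check can be sketched in a few lines. This is illustrative only — the `CipherCheck` object and the particular list of deprecated suites are my own, not WS’s actual implementation:

```scala
// Sketch of a configuration-time weak cipher check. The object name and the
// deprecated list below are illustrative, not WS's actual code.
object CipherCheck {
  // A few RC4/MD5 based suites, using standard JSSE cipher suite names.
  val deprecatedCiphers: Set[String] = Set(
    "SSL_RSA_WITH_RC4_128_MD5",
    "SSL_RSA_WITH_RC4_128_SHA",
    "TLS_ECDHE_RSA_WITH_RC4_128_SHA"
  )

  // Throws at configuration time if a weak cipher is enabled, unless the
  // acceptWeakCiphers escape hatch is set.
  def validate(enabledCiphers: Seq[String], acceptWeakCiphers: Boolean): Unit = {
    val weak = enabledCiphers.filter(deprecatedCiphers.contains)
    if (weak.nonEmpty && !acceptWeakCiphers) {
      throw new IllegalStateException("Weak ciphers found: " + weak.mkString(", "))
    }
  }
}
```

If the loose flag is set, the check simply passes through.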

As with protocols, you can configure the cipher list by hand:

ws.ssl.enabledCiphers = [
  "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384"
  "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256"
]

If you have the option, you probably want to set jdk.tls.disabledAlgorithms anyway: I don’t know of a way to ensure a minimum key size in the server handshake without it.

And that about wraps things up for cipher suites.

Next

X.509 Certificates!

Building a Development Environment With Docker

| Comments

TL;DR

I’ve written a cheat sheet for Docker, and I have a github project for it. Here’s the thinking that went into why Docker, and how best to use it.

The problem

You want to build your own development environment from scratch, and you want it to be as close to a production environment as possible.

Solutions

Development environments usually just… evolve. There have been many attempts at producing a consistent development environment, even between developers. Eventually, through trial and error, a common set of configuration files and install instructions turns into something that resembles a scaled down and testable version of the production environment, managed through version control and a set of bash scripts.

But even when it gets to that point, it’s not over, because modern environments can involve dozens of different components, all with their own configuration, often communicating with each other through TCP/IP or, even worse, talking to a third party API like S3. To replicate the production environment, these lines of communication must be drawn — but they can’t all be squashed into one single machine. Something has to give.

Solution #1: Shared dev environment

The first solution is to set up an environment with exactly the same machines in the same way as production, only scaled down for development. Then, everyone uses it.

This works only if there is no conflict between developers, and resource use and contention is not a problem. Oh, and you don’t want to swap out one of those components for a particular team.

If you need to access the environment from outside the office, you’ll need a VPN. And if you’re on a flaky network or on a plane, you’re out of luck.

Solution #2: Virtual Machines

The second solution is to put as much of the environment as possible onto the developer’s laptop.

Virtual Machines such as VirtualBox will allow you to create an isolated dev environment. You can package VMs into boxes with Vagrant, and create fresh VMs from template as needed. They each have their own IP address, and you can get them to share filesystems.

However, VMs are not small. You can chew up gigabytes very easily providing the OS and packages for each VM, and those VMs do not share CPU or memory when running together. If you have a complex environment, you will run into a point where you either run out of disk space or memory, or you break down and start packaging multiple components inside a single VM, producing an environment which may not reflect production and is far more fragile and prone to complexities.

Solution #3: Docker

Docker solves the isolation problem. Docker provides (consistent, reproducible, disposable) containers that make components appear to be running on different machines, while sharing CPU and memory underneath, and provides TCP/IP forwarding and filesystems that can be shared between containers.

So, here’s how you build a development environment in Docker.

Docker Best Practices

Build from Dockerfile

The only sane way to put together a dev environment in Docker is to use raw Dockerfile and a private repository. Pull from the central docker registry only if you must, and keep everything local.

Chef recipes are slow

You might think to yourself, “self, I don’t feel like reinventing the wheel. Let’s just use chef recipes for everything.”

The problem is that creating new containers is something that you’ll do lots. Every time you create a container, seconds will count, and minutes will be totally unacceptable. It turns out that calling apt-get update is a great way to watch nothing happen for a while.

Use raw Dockerfile

Docker builds images on a layered file system (AUFS): each Dockerfile command produces a layer, and Docker can reuse (cache) layers for commands it has already run. You want to keep the cache happy. You want to put all the mutable stuff at the very end of the Dockerfile, so you can leverage the cache as much as possible. Chef recipes are a black box to Docker.

The way this breaks down is:

  1. Cache wins.
  2. Chef, ansible, etc, does not use cache.
  3. Raw Dockerfile uses cache.
  4. Raw Dockerfile wins.

There’s another way to leverage Docker, and that’s to use an image that doesn’t start off from ubuntu or basebox. You can use your own base image.

The Basics

Install an internal docker registry

Install an internal registry (the fast way) and run it as a daemon:

docker run -name internal_registry -d -p 5000:5000 samalba/docker-registry

Alias server to localhost:

echo "127.0.0.1      internal_registry" >> /etc/hosts

Check internal_registry exists and is running on port 5000:

apt-get install -y curl
curl --get --verbose http://internal_registry:5000/v1/_ping

Install Shipyard

Shipyard is a web application that provides an easy to use interface for seeing what Docker is doing.

Open up a port in your Vagrantfile:

config.vm.network :forwarded_port, :host => 8005, :guest => 8005

Install shipyard from the central index:

SHIPYARD=$(docker run \
    -name shipyard \
  -p 8005:8000 \
  -d \
  shipyard/shipyard)

You will also need to replace /etc/init/docker.conf with the following:

description "Docker daemon"

start on filesystem and started lxc-net
stop on runlevel [!2345]

respawn

script
        /usr/bin/docker -d -H tcp://0.0.0.0:4243 -H unix:///var/run/docker.sock
end script

Then reboot the VM.

Once the server has rebooted and you’ve waited for a bit, you should have shipyard up. The credentials are “shipyard/admin”.

  • Go to http://localhost:8005/hosts/ to see Shipyard’s hosts.
  • In the vagrant VM, ifconfig eth0 and look for “inet addr:10.0.2.15” — enter the IP address.

Create base image

  • Create a Dockerfile with initialization code such as `apt-get update` / `apt-get install` etc: this is your base.
  • Build your base image with docker build -t internal_registry:5000/base ., then push it to the internal registry with docker push internal_registry:5000/base.

Build from your base image

Make all of your other Dockerfiles pull from “base” instead of ubuntu.

Keep playing around until you have your images working.

Push your images

Push all of your images into the internal registry.

Save off your registry

If you need to blow away your Vagrant VM or set someone else up, it’s much faster to do it with all the images still intact:

docker export internal_registry > internal_registry.tar
gzip internal_registry.tar
mv internal_registry.tar.gz /vagrant

Tips

  • docker add blows away the cache, don’t use it (bug, possibly fixed).
  • There’s a limit to the number of layers you can have, pack your apt-get onto a single line.
  • Keep common instructions at the top of the Dockerfile to leverage the cache as long as possible.
  • Use tags when building (Always pass the -t option to docker build).
  • Never map the public port in a Dockerfile.

Exposing Services

If you are running a bunch of services in Docker and want to expose them through Virtualbox to the host OS, you need to do something like this in your Vagrantfile:

(49000..49900).each do |port|
  config.vm.network :forwarded_port, :host => port, :guest => port
end

Let’s start up Redis:

docker pull johncosta/redis
docker run -p 6379 -d johncosta/redis

Then find the port:

docker ps
docker port <redis_container_id> 6379

Then connect to the 49xxx port that Virtualbox exposes.

Cleanup

docker ps -a | grep 'weeks ago' | awk '{print $1}' | xargs docker rm

Or eliminate all containers:

docker rm `docker ps -a -q`

Running from an existing volume

docker run -i -t -volumes-from 5ad9f1d9d6dc mytag /bin/bash

Sources

Play in Practice

| Comments

I gave a talk on Play in Practice at the SF Scala meetup recently. Thanks to Stackmob for hosting us and providing pizza.

I went into describing how to implement CQRS in Play, but there was a fairly long question and answer section about Play as well. I couldn’t go into detail on some of the answers and missed some others, so I’ll fill in the details here.

Video

Slides

Core API

The core API is Action, which takes in a Request and returns a Result. The Request is immutable, but you can wrap it with extra information, which you’ll typically do with action composition. 2.1.1 introduced EssentialAction, which uses (RequestHeader => Iteratee[Array[Byte], Result]) instead of Action’s (Request => Result) and makes building Filters easier.
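
As a rough sketch of the shapes involved — these are simplified stand-ins for illustration, not Play’s actual definitions:

```scala
// Simplified stand-ins for the types described above (not Play's actual
// classes; just the shape of Action: Request => Result).
object CoreApiSketch {
  case class RequestHeader(path: String)
  case class Request(header: RequestHeader, body: String)
  case class Result(status: Int, body: String)

  // An "action" is just a function from Request to Result.
  val index: Request => Result =
    request => Result(200, "You requested " + request.header.path)
}
```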

Again, Play’s core is simple. About as simple as you can get.

Streaming

Streaming is handled by Iteratees, which can be a confusing topic for many people. There are good writeups here and here. lila is the best application to look at for streaming, especially for sockets and hubs.

Having good streaming primitives is something that I didn’t get into that much in the talk, but is still vitally important to “real time web” stuff.
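
To give a flavor of the idea, here is a toy model only — Play’s Iteratee is considerably richer — of a consumer that is fed chunks one at a time and eventually produces a result:

```scala
// A toy model of an iteratee: either still consuming (Cont) or finished (Done).
sealed trait Step[E, A]
final case class Cont[E, A](k: Option[E] => Step[E, A]) extends Step[E, A]
final case class Done[E, A](result: A) extends Step[E, A]

object Toy {
  // An iteratee that counts the chunks it is fed.
  def counter: Step[String, Int] = {
    def loop(n: Int): Step[String, Int] =
      Cont {
        case Some(_) => loop(n + 1)
        case None    => Done(n)
      }
    loop(0)
  }

  // Feed a list of chunks, then signal end-of-input with None.
  def feed[E, A](it: Step[E, A], input: List[E]): Option[A] = {
    val fed = input.foldLeft(it) {
      case (Cont(k), e) => k(Some(e))
      case (done, _)    => done
    }
    fed match {
      case Done(a) => Some(a)
      case Cont(k) =>
        k(None) match {
          case Done(a) => Some(a)
          case _       => None
        }
    }
  }
}
```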

Filters

If you want to do anything that you’d consider as part of a “servlet pipeline”, you use Filters, which are designed to work with streams.

An example of a good Filter is to automatically uncompress an asset — here’s an example that uses an Enumeratee:

class GunzipFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      if (request.headers.get("Content-Encoding").exists(_ == "gzip")) {
        Gzip.gunzip() &>> next(request)
      } else {
        next(request)
      }
    }
  }
}

Note that this only does uncompression: Automatic streaming gzip compression of templates is not available “out of the box” in 2.1.2, but it should be available in Play 2.2.

Templating

Play comes packaged with its own template language, Twirl, but you’re not required to use it. There is an integration into Scalate that gives you Mustache, Jade, Scaml and SSP. There’s also an example project that shows how to integrate Play with Freemarker.

One thing that Play doesn’t address directly is how to set up a structure for page layouts. Play provides you with index.scala.html and main.scala.html, but doesn’t provide you with any more structure than that. If you set up a header and footer and allow for subdirectories to use their own templates, you can minimize the amount of confusion in the views.

There’s an example in RememberMe, and this is the approach that lila takes as well.

Another thing is that Play’s default project template is intentionally minimal. If you use Backbone and HTML5 templates, then a custom giter8 template like mprihoda/play-scala may suit you better.

JSON

Play’s JSON API is very well done, and is a great way to pass data around without getting into the weeds or having to resort to XML. It goes very well with case classes.

The documentation isn’t bad, but Pascal Voitot (the author of play-json) has a series of blog posts that go the extra mile: reading JSON with JsPath, writing JSON formats, transforming JSON, and even defining JSON macros.

Forms

Form handling is one of those things that is never intuitive for me. The documentation helps, but really if you want to know how to do validation, using the sample forms application is the best way to pick things up. There are many useful nuggets that aren’t explicitly discussed in the documentation. In particular, the ability to make custom constraints is extremely useful.
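
The custom constraint idea can be sketched without Play’s Form API — the `Constraint` type and helpers below are illustrative, not Play’s play.api.data.validation types:

```scala
// A constraint maps a value to Some(errorMessage), or None if valid.
// Illustrative sketch, not Play's actual validation API.
object Constraints {
  type Constraint[A] = A => Option[String]

  val nonEmpty: Constraint[String] =
    s => if (s.trim.isEmpty) Some("error.required") else None

  def minLength(n: Int): Constraint[String] =
    s => if (s.length < n) Some("error.minLength(" + n + ")") else None

  // Combine constraints: the first error wins.
  def all[A](cs: Constraint[A]*): Constraint[A] =
    a => cs.flatMap(c => c(a)).headOption
}
```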

Routing

There’s only one routing API replacement that I know of, Play Navigator, a routing DSL for REST services. However, you can use custom data types in the routing table using QueryStringBindable and PathBindable, and save yourself some “string2foo” conversion.

Asynchronous Operation

Talking about Akka (and the other async code) in Play is tricky for a couple of reasons.

The first reason is that “async” involves a number of different concepts, all of which are complex and worthy of blog posts in themselves. Sadek Drobi gives a nice overview, and there’s an exhaustive mailing list discussion about how the asynchronous code in Play works.

The second bit of trickiness is that Play 2.0 and Play 2.1 async features do not work in quite the same way.

Play 2.0 uses Akka for almost everything internally.

Play 2.1 does not use Akka to handle incoming requests, or iteratees, or internal code. It uses scala.concurrent.Future instead with its own thread pools.

Play 2.1 also uses a default thread pool, which is Akka backed — ActorSystem("play") — and is used for the application code, i.e. the stuff inside Action.

This is important, because blog posts like James Ward’s Optimizing Play 2 for Database Driven Apps are only applicable to Play 2.0, not 2.1. For 2.1, use the thread pools documentation.
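
Outside of Play, the same pattern — running application work on a dedicated, configurable pool rather than the default — can be sketched with plain scala.concurrent. The fixed thread pool here is a stand-in for Play’s configured execution contexts:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Stand-in for Play's configured thread pools: a dedicated pool for
// "backend" work, distinct from the default execution context.
object Pools {
  private val executor = Executors.newFixedThreadPool(4)
  val backendPool: ExecutionContext = ExecutionContext.fromExecutorService(executor)

  // Run work on the dedicated pool by passing it explicitly.
  def work(): Future[Int] = Future(21 * 2)(backendPool)

  def shutdown(): Unit = executor.shutdown()
}
```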

In addition to the “play” actor system, there’s a Play Akka plugin. The Akka plugin is actually packaged with Play itself, and you can find it under play.api.libs.concurrent.Akka.

So, if Play already uses Akka under the hood, then why define an Akka plugin?

I believe it’s because the Akka plugin defines a distinct ActorSystem("application") that can be used for backend tasks like sending email, and can be configured without impacting the “play” ActorSystem. The Akka plugin provides a useful default and enforces separation between Play’s actors and the application’s actors.

CQRS

Given that most of the CQRS talks I’ve read have been from the enterprise perspective, it was nice to talk about CQRS in the context of functional programming and statelessness.

Message passing is something that is typically mentioned in inter-process communication, or in message oriented middleware. Akka — a message passing architecture at the thread level — allows us to build “zero coupling” systems. As message passing patterns, CQRS and DDD are a good set of idioms for thinking about domain logic together, especially since they already assume eventual consistency and indeterminate time.

Authentication

If you’re using Scala, there are two good authentication options, RememberMe (ahem) and SecureSocial. SecureSocial has better documentation and has been around longer, but RememberMe has better resistance to some attacks. I’m working to integrate RememberMe’s functionality into SecureSocial, but you’ll want to check out both of them.

There’s also a pure Java authentication option: Play Authenticate. I haven’t used this, but the code looks reasonable.

If you’d rather go it alone or need a basic starter application, you may find Play20StartApp useful (password reset, account confirmation, etc.)

Authorization

Deadbolt 2 is the best known authorization framework. You can use things like Shiro, but you’re better off with something specifically designed for Play.

Security

Play does fairly well on security compared to other frameworks. For example, it can set an X-Frame-Options header to protect against clickjacking, will sign the session cookie with an HMAC to protect against broken authentication, supports SSL, etc.

However, there are some things that Play doesn’t do.

Play doesn’t encrypt the session cookie, so you shouldn’t store any sensitive information in there.

Play won’t protect you from replay attacks, as Play is stateless by default. You can specify a nonce or request counter to counteract this, and RememberMe uses a token based approach for persistent login cookies.

Play won’t protect you against injection attacks. You can specify value classes to validate your input against raw strings.
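
One way to sketch that idea — using a plain wrapper class here for simplicity; an AnyVal value class works similarly, and `Username` is a hypothetical example type:

```scala
// Hypothetical example of wrapping raw input in a validated type, so that
// unvalidated strings can't flow through the application unchecked.
final class Username private (val value: String) {
  override def toString: String = value
}

object Username {
  // The only way to get a Username is through validation.
  def parse(raw: String): Option[Username] =
    if (raw.nonEmpty && raw.forall(_.isLetterOrDigit)) Some(new Username(raw))
    else None
}
```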

Play won’t protect you against security misconfiguration. You should have a release checklist.

Play won’t protect you from insecure cryptography practices. Education helps, but there’s a lot of misinformation out there as well; watch this video (and slides) and be wary of things you read on Stack Overflow and Hacker News.

Play won’t protect you from failure to restrict URL access; that’s up to the authorization framework.

Play does have cross site request forgery protection, but it will only be effective if you enable the filter and explicitly pass the CSRF helper function in through every single form. There is an authenticity token approach as well, though I haven’t used it.

Most importantly, Play won’t tell you about how web application security fails. I recommend The Tangled Web as an excellent overview on how web applications are stitched together out of different technologies, and how to secure them.

Logging

The underlying logger for Play is Logback. Logback is one of the few hardcoded dependencies in Play, which has caused some issues. Fortunately, Play uses Logback through the SLF4J logging API, but there’s no option built into Play to allow Logback to be swapped out easily. There are reports of people swapping out Logback for other logging frameworks, but I haven’t tried them.

There have also been issues with the logging configuration conflicting in places or being unclear. One thing that has tripped people up repeatedly is that all the logging configuration must be done in one place. You can’t have some logging configuration in application.conf and some configuration in logger.xml.

While Play uses SLF4J under the hood, it doesn’t expose SLF4J functionality in play.api.Logger. In fact, there are only two method signatures for logging:

  def error(message: => String) : Unit
  def error(message: => String, error: => Throwable) : Unit

This doesn’t really cover the way I like to log, and it doesn’t provide even the features that are available in SLF4J, such as parameterized logging. My own answer was to ignore the Play logging API entirely and write a Logging wrapper directly against SLF4J (with kestrel combinators, natch), but you may want to use something out of the box.

For example, Typesafe Logging uses SLF4J and provides you with this:

  def error(message: String): Unit
  def error(message: String, params: AnyRef*): Unit
  def error(message: String, t: Throwable): Unit
  def error(marker: Marker, message: String): Unit
  def error(marker: Marker, message: String, params: AnyRef*): Unit
  def error(marker: Marker, message: String, t: Throwable): Unit

Or you can use loglady, which uses the Python API style with printf syntax:

  def error(message: String,  params: Any*) : Unit
  def error(thrown: Throwable, message: String,  params: Any*) : Unit
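
The kestrel combinator mentioned above is just the K combinator — perform a side effect (such as logging) and return the original value unchanged. It can be sketched in a few lines; the `tap` name is my own, not part of any of these libraries:

```scala
// A minimal kestrel combinator: run a side effect on a value (e.g. log it)
// and return the value unchanged, so it composes inside expressions.
object LoggingSupport {
  def tap[A](a: A)(sideEffect: A => Unit): A = {
    sideEffect(a)
    a
  }
}
```

This lets you log an intermediate value without breaking an expression apart into statements.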

WAR packaging

I said in the Q&A that I didn’t think you could package Play 2 applications as WAR files. Well, it turns out that there is a plugin available, and it works with Servlet 3.0 and 2.5 containers (Tomcat 6/7, Jetty 7/8/9, JBoss 5/6/7, etc). You may need to tweak the logger to work in the container correctly.

I don’t know how Play’s performance is affected by running inside a servlet container; let me know if it works for you.

Asset Packaging

Javascript assets in Play are minified using Google Closure — this happens automatically on play dist. They also can be gzipped using a custom SBT script.

This is a good enough solution for most people. If you are really intent on minimizing your asset overhead, you should consider putting your assets on a static file server backed by HAProxy, or putting them on a CDN.

Email

Email is one of those things that I think should be divorced as much as possible from Play. It’s backend and async by nature, and this makes it something that is best handled through Akka.

akka-email is available on Github and gives you a starting place to build up a message passing infrastructure for email.

Metrics

Instrumenting applications is important. Sadly, every metrics solution has its own API, so you can’t easily switch between them. However, there’s no shortage of options.

  • New Relic recently came out with support for Play 2.
  • Ostrich, the Twitter metrics library.
  • Metrics, with the metrics-scala from Erik Van Oosten, cross-compiled for multiple versions. This is what I use.
  • Pillage, which has a Scala option (I have not tried this).
  • statsd module for Play 2.

The Typesafe Console is the best monitoring tool to use if you are using Akka, but that depends on having a Typesafe subscription if you want to use it in production.

Load and Stress Testing

Determining a load plan is hard, and involves some amount of educated guessing. Fortunately, most applications simply don’t get that much load, even ones you’d think would be busy.

Gatling and wrk are good ways of stressing a system, but they don’t reflect normal user behavior. Apache JMeter is very good at modelling random user behavior, but is clunky. A good and arguably the most realistic load test is to hire a couple of hundred users from Mechanical Turk to pound on the site at once, but this may not be very convenient.

Deployment

There are a number of different ways to deploy Play projects. Using play dist gets you most of the way, but you may want to deploy with Ansible or Chef or Fabric. Or you can use upstart or even git hooks.

If you just want to push changes to a staging server as they happen, you can do this with rsync -avz --delete -e ssh $deployed_code staging:/opt/play-app, although this isn’t so great for production.

Java Support

The Java and Scala APIs are very similar. However, there are a couple of notable differences, which come out of Java’s lack of closure support:

  • The Java API does not support Iteratees.
  • The Java API does not have an implicit execution context.

The play.libs.F library goes a fair way to providing Scala’s functional programming constructs in Java.

More?

If you have suggestions or want to point something out, please email me at will.sargent@gmail.com, and I’ll fill out this post with more details.

Error Handling in Scala

| Comments

The previous post was mostly about programming “in the small” where the primary concern is making sure the body of code in the method does what it’s supposed to and doesn’t do anything else. This blog post is about what to do when code doesn’t work — how Scala signals failure and how to recover from it, based on some insightful discussions.

First, let’s define what we mean by failure.

  • Unexpected internal failure: the operation fails as the result of an unfulfilled expectation, such as a null pointer reference, violated assertions, or simply bad state.
  • Expected internal failure: the operation fails deliberately as a result of internal state, i.e. a blacklist or circuit breaker.
  • Expected external failure: the operation fails because it is told to process some raw input, and will fail if the raw input cannot be processed.
  • Unexpected external failure: the operation fails because a resource that the system depends on is not there: there’s a loose file handle, the database connection fails, or the network is down.

Java has one explicit construct for handling failure: Exception. There’s some difference of usage in Java throughout the years — IO and JDBC use checked exceptions throughout, while other API like org.w3c.dom rely on unchecked exceptions. According to Clean Code, the best practice is to use unchecked exceptions in preference to checked exceptions, but there’s still debate over whether unchecked exceptions are always appropriate.

Exceptions

Scala makes “checked vs unchecked” very simple: it doesn’t have checked exceptions. All exceptions are unchecked in Scala, even SQLException and IOException.

The way you catch an exception in Scala is by defining a PartialFunction on it:

val input = new BufferedReader(new FileReader(file))
try {
  try {
    for (line <- Iterator.continually(input.readLine()).takeWhile(_ != null)) {
      Console.println(line)
    }
  } finally {
    input.close()
  }
} catch {
  case e:IOException => errorHandler(e)
}

Or you can use control.Exception, which provides some interesting building blocks. The docs say “focuses on composing exception handlers”, which means that this set of classes supplies most of the logic you would put into a catch or finally block.

Exception.handling(classOf[RuntimeException], classOf[IOException]) by println apply {
  throw new IOException("foo")
}

Using the control.Exception methods is fun and you can string together exception handling logic to create automatic resource management, or an automated exception logger. On the other hand, it’s full of sharp things like allCatch. Leave it alone unless you really need it.

Another important caveat is to make sure that you are catching the exceptions that you think you’re catching. A common mistake (mentioned in Effective Scala) is to use a default case in the partial function:

try {
  operation()
} catch {
  case e => errorHandler(e)
}

This will catch absolutely everything, including OutOfMemoryError and other errors that would normally terminate the JVM.

If you want to catch “everything” that would normally happen, then use NonFatal:

import scala.util.control.NonFatal

try {
  operation()
} catch {
  case NonFatal(exc) => errorHandler(exc)
}

Exceptions don’t get mentioned very much in Scala, but they’re still the bedrock for dealing with unexpected failure. For unexpected internal failure, there’s a set of assertion methods called require, assert, and assume, which all use throwables under the hood.
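
For instance, a trivial sketch: require throws IllegalArgumentException for bad arguments, while assert (and assume) throw AssertionError:

```scala
// require guards preconditions on arguments; assert checks internal invariants.
object Assertions {
  def divide(a: Int, b: Int): Int = {
    require(b != 0, "divisor must be non-zero") // IllegalArgumentException if violated
    val result = a / b
    assert(b * result + a % b == a)             // AssertionError if violated
    result
  }
}
```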

Option

Option represents optional values, returning an instance of Some(A) if A exists, or None if it does not. It’s ubiquitous in Scala code, to the point where it fades into invisibility. The cheat sheet is the best way to get a handle on it.

It’s almost impossible to use Option incorrectly, but there is one caveat: Some(null) is valid. If you have code that returns null, wrap it in Option() to convert it:

val optionResult = Option(null) // optionResult is None.

Either

Either is a disjoint union construct. It returns either an instance of Left[L] or an instance of Right[R]. It’s commonly used for error handling, where by convention Left is used to represent failure and Right is used to represent success. It’s perfect for dealing with expected external failures such as parsing or validation.

case class FailResult(reason:String)

def parse(input:String) : Either[FailResult, String] = {
  val r = new StringTokenizer(input)
  if (r.countTokens() == 1) {
    Right(r.nextToken())
  } else {
    Left(FailResult("Could not parse string: " + input))
  }
}

Either is like Option in that it makes an abstract idea explicit by introducing an intermediate object. Unlike Option, it does not have a flatMap method, so you can’t use it in for comprehensions — not safely at any rate. You can use a left or right projection if you’re not interested in handling failure:

val rightFoo = for (outputFoo <- parse(input).right) yield outputFoo

More typically, you’ll use fold:

parse(input).fold(
  error => errorHandler(error),
  success => { ... }
)

You’re not limited to using Either for parsing or validation, of course. You can use it for CQRS.

case class UserFault()
case class UserCreatedEvent()

def createUser(user:User) : Either[UserFault, UserCreatedEvent]

or arbitrary binary choices:

def whatShape(shape:Shape) : Either[Square, Circle]

Either is powerful, but it’s trickier than Option. In particular, it can lead to deeply nested code. It can also be misunderstood. Take the following Java lookup method:

public Foo lookup(String id) throws FooException // throw if not found or db exception

Scala has Option, so we can use that. But what if the database goes down? Using the error reporting convention of Either might suggest the following:

def lookup() : Either[FooException,Option[Foo]]

But this is awkward. If you return Either because something might fail unexpectedly, then immediately half your API becomes littered with Either[Throwable, T].

Ah, but what if you’re modifying a new object?

def modify(inputFoo:Foo) : Either[FooException,Foo]

If you’re dealing with expected failure and there’s good odds that the operation will fail, then returning Either is fine: create a case class representing failure FailResult and use Either[FailResult,Foo].

Don’t return exceptions through Either. If you want a construct to return exceptions, use Try.

Try

Try is similar to Either, but instead of returning any class in a Left or Right wrapper, it returns Failure[Throwable] or Success[T]. It’s an analogue for the try-catch block: it replaces try-catch’s stack based error handling with heap based error handling. Instead of having an exception thrown and having to deal with it immediately in the same thread, it disconnects the error handling and recovery.

Try can be used in for comprehensions: unlike Either, it implements flatMap. This means you can do the following:

val sumTry = for {
  int1 <- Try(Integer.parseInt("1"))
  int2 <- Try(Integer.parseInt("2"))
} yield {
  int1 + int2
}

and if there’s an exception returned from the first Try, then the for comprehension will terminate early and return the Failure.

You can get access to the exception through pattern matching:

sumTry match {
  case Failure(thrown) => {
    Console.println("Failure: " + thrown)
  }
  case Success(s) => {
    Console.println(s)
  }
}

Or through failed:

if (sumTry.isFailure) {
  val thrown = sumTry.failed.get
}

Try will let you recover from exceptions at any point in the chain, so you can defer recovery to the end:

val sum = (for {
  int1 <- Try(Integer.parseInt("one"))
  int2 <- Try(Integer.parseInt("two"))
} yield {
  int1 + int2
}) recover {
  case e => 0
}

Or recover in the middle:

val sum = for {
  int1 <- Try(Integer.parseInt("one")).recover { case e => 0 }
  int2 <- Try(Integer.parseInt("two"))
} yield {
  int1 + int2
}

There’s also a recoverWith method that will let you swap out a Failure:

val sum = for {
  int1 <- Try(Integer.parseInt("one")).recoverWith {
    case e: NumberFormatException => Failure(new IllegalArgumentException("Try 1 next time"))
  }
  int2 <- Try(Integer.parseInt("2"))
} yield {
  int1 + int2
}

You can mix Either and Try together to coerce methods that throw exceptions internally:

val either : Either[String, Int] = Try(Integer.parseInt("1")).transform({ i => Success(Right(i)) }, { e => Success(Left("FAIL")) }).get
Console.println("either is " + either.fold(l => l, r => r))

Try isn’t always appropriate. If we go back to the first exception example, this is the Try analogue:

val input = new BufferedReader(new FileReader(file))
val results = Seq(
  Try {
    for (line <- Iterator.continually(input.readLine()).takeWhile(_ != null)) {
      Console.println(line)
    }
  },
  Try(input.close())
)

results.foreach { result =>
  result.recover {
    case e:IOException => errorHandler(e)
  }
}

Note the kludge to get around the lack of a finally block to close the stream. Viktor Klang and Som Snytt suggested using a value class and transform to enhance Try:

implicit class TryOps[T](val t: Try[T]) extends AnyVal {
  def eventually[Ignore](effect: => Ignore): Try[T] = {
    val ignoring = (_: Any) => { effect; t }
    t transform (ignoring, ignoring)
  }
}

Try(1 / 0).map(_ + 1) eventually { println("Oppa Gangnam Style") }

Which is cleaner, at the cost of some magic.
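
As an illustration, here's the file-reading example from above rewritten with eventually so the close runs whether or not reading throws. I've repeated the TryOps definition (minus the AnyVal) so the snippet stands alone, and conjured up a temp file to read:

```scala
import java.io._
import scala.util.Try

implicit class TryOps[T](val t: Try[T]) {
  def eventually[Ignore](effect: => Ignore): Try[T] = {
    val ignoring = (_: Any) => { effect; t }
    t transform (ignoring, ignoring)
  }
}

// A throwaway file so the snippet is runnable as-is.
val file = File.createTempFile("example", ".txt")
val input = new BufferedReader(new FileReader(file))

val result = Try {
  Iterator.continually(input.readLine()).takeWhile(_ != null).foreach(Console.println)
} eventually {
  input.close() // runs on Success and on Failure alike
}
```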

Try was originally invented at Twitter to solve a specific problem: when using Future, an exception may be thrown on a different thread than the caller's, and so can't be returned through the stack. By returning the exception as a value instead of throwing it, the system can reify the failure and let it cross thread boundaries back to the calling context.
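
A sketch of what that looks like with Scala's Future, where onComplete hands you the result as a Try:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

// The parse throws on a worker thread; the exception never crosses
// back to the caller via the stack, but arrives reified as a Failure.
val f: Future[Int] = Future {
  Integer.parseInt("not a number")
}

f.onComplete {
  case Success(n)      => Console.println("parsed: " + n)
  case Failure(thrown) => Console.println("failed: " + thrown)
}
```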

Try is new enough that people are still getting comfortable with it. I think that it’s a useful addition when try-catch blocks aren’t flexible enough, but it does have a snag: returning Try in a public API means exceptions must be dealt with by the caller. Using Try also implies to the caller that the method has captured all non fatal exceptions itself. If you’re doing this in your trait:

def modify(foo:Foo) : Try[Foo]

Then Try should be at the top to ensure exception capture:

def modify(foo:Foo) : Try[Foo] = Try {
  Foo()
}

Because exceptions must be dealt with by the caller, you are placing more trust in the caller to handle or delegate a failure appropriately. With try-catch blocks, doing nothing means that the exception can pass up the stack to a top level exception handler. With Try, exceptions must be either returned or handled by each method in the chain, just like checked exceptions.

To pass the exception along, use map:

def fooToString(foo:Foo) : Try[String] = {
  modify(foo).map { outFoo =>
    outFoo.toString()
  }
}

Or, if the return type is Unit, use get to rethrow the exception up the stack:

def doStuff : Unit = {
  val modifiedFoo = modify(foo).get // throws the exception if failure
}

And you want to avoid this:

modify(foo) match {
  case Failure(f) => {
    // database failure?  don't care, swallow exception.
  }
  case Success(s) => {
    ...
  }
}

If you have a system that needs specific error logging or error recovery, it’s probably safer to stick to unchecked exceptions.

TL;DR

  • Throw Exception to signal unexpected failure in purely functional code.
  • Use Option to return optional values.
  • Use Option(possiblyNull) to avoid instances of Some(null).
  • Use Either to report expected failure.
  • Use Try rather than Either to return exceptions.
  • Use Try rather than a catch block for handling unexpected failure.
  • Use Try when working with Future.
  • Exposing Try in a public API has a similar effect to a checked exception. Consider using exceptions instead.

Problems Scala Fixes

| Comments

When I tell people I write code in Scala, a typical question is: well, why? When it comes to writing code, most of my work is straightforward: an SQL database on the backend, some architectural glue, CRUD, some exception handling, transaction handlers and an HTML or JSON front end. The tools have changed, but the problems are usually the same: you could get a website up in 5 minutes with Rails or Dropwizard. So why pick Scala?

It’s a tough question to answer off the bat. If I point to the language features, it doesn’t get the experience across. It’s like explaining why I like English by reading from a grammar book. I don’t like Scala because of its functional aspects or its higher kinded type system. I like Scala because it solves practical, real world problems for me.

You can think of Scala as Java with all the rough edges filed off, with new features that make it easier to write correct code and harder to create bugs. Scala is not a purist’s language — it goes out of its way to make it easy for Java programmers to dip their toes in the pool. You can literally take your Java code and hit a key to create working Scala code.

So what problems does Scala solve?

Let’s start with the single biggest problem in programming, the design flaw that’s caused more errors than anything else combined. Null references.

Solving for Null

Scala avoids null pointer references by providing a special type called Option. Methods that return Option[A] (where A is the type that you want, e.g. Option[String]) will give you an object that is either Some, a wrapper around your value, or None. There are a number of different ways you can use Option, but I'll just mention the ones I use most. You can chain Options together in Scala using for comprehensions:

  for {
     foo <- request.params("foo")
     bar <- request.params("bar")
  } yield myService.process(foo, bar)

or through a map:

  request.params("foo").map { foo => logger.debug(foo) }

or through pattern matching.

  request.params("foo") match {
    case Some(foo) => { logger.debug(foo) }
    case None => { logger.debug("no foo :-(") }
  }

Not only is this easy, but it's also safer. You can still flirt with failure by calling myOption.get, which throws if the option is None, but if you do that, you deserve what you get. Not having to deal with NPEs is a pleasure.
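
If you do need a value out, getOrElse and map are the safer tools (the map and its values here are made up):

```scala
// A stand-in for request parameters.
val params = Map("foo" -> "fooValue")

// Map#get returns an Option; supply a fallback rather than calling .get.
val foo = params.get("foo").getOrElse("default") // "fooValue"
val bar = params.get("bar").getOrElse("default") // "default"

// Or transform the value only if it's present.
val len = params.get("foo").map(_.length).getOrElse(0)
```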

Right Type in the Right Place

What’s the second biggest problem in programming? It’s a huge issue in security and in proving program correctness: invalid, unchecked input.

Take the humble String. The work of manipulating strings is one of the biggest hairballs in programming — they’re pulled in from the environment or embedded in the code itself, and then programs try to figure out how best to deal with them. In one case, a string is displayed to the user and it’s done. In another case, an SQL query is embedded as a query parameter on a web page and passed straight through to the database. To the compiler, they’re just strings and there is no difference between them. But there are some types of strings that are suitable to pass to databases, and some which are not. Ideally, we’d like to tell the compiler that SQL and query parameters have different types. Scala makes this easy.

With the Shapeless library, you can add distinguishing type information to objects and ensure that you can’t pass random input in:

import shapeless.TypeOperators._
type SqlString = Newtype[String, Any]
val x: SqlString = newtype("SELECT * FROM USER")

I’ve called out strings because it’s a good example, but you can also do this for repository IDs. No more this:

  case class User(id: Int, firstName:String)
  def lookup(id:Int) : User

When you can have this:

  case class User(id: Id[User], firstName:String)
  def lookup(id:Id[User]) : User

You can also use this to validate input on the front end. One of the big problems with regular expressions is that when you parse a random string for certain kinds of input, you get back… more strings. You may be validating a string as a username (no spaces, no odd characters), but what you’ve got at the end is a string that says it’s a username.

val rawInput = request.params("foo")
if (isUsername(rawInput)) {
  val username = rawInput
}

You can replace that with something nicer.

   val username : Option[Username] = parseUsername(rawInput)

This embeds the constraint in the type itself. You can design your API to accept Username instead of String, and so enforce a kind of whitelisting.

Can you do this in Java? Yes, but it’s inconvenient. Scala’s type system makes it easy for you, and in 2.10 there will be Value Classes, which will provide this functionality in the core language itself.
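
As a sketch, with a hypothetical parseUsername and a deliberately strict rule (alphanumeric only); with 2.10 value classes, Username could extend AnyVal to avoid the wrapper allocation:

```scala
// A plain wrapper for now; `extends AnyVal` makes it a 2.10 value class.
case class Username(raw: String)

// The validation rule here is invented for illustration.
def parseUsername(input: String): Option[Username] =
  if (input.nonEmpty && input.forall(_.isLetterOrDigit)) Some(Username(input))
  else None

// The API accepts Username, so an unvalidated String can't slip in.
def greet(user: Username): String = "hello, " + user.raw
```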

Doing the gruntwork for you

The previous example can be improved though. Really, we just want a Username at the end — we don’t want to have to call parseUsername on it. Fortunately, Scala rewards the lazy with implicit conversions. If you define a method like this and use the implicit keyword:

  implicit def string2username(input : String) : Option[Username] = parseUsername(input)

And do this:

  val username : Option[Username] = rawInput

Then the compiler is smart enough to see that a String isn’t an Option[Username], and looks through any implicit methods available to do the conversion.

There is an element of ‘magic’ to implicit conversions, especially when you’re reading someone else’s code and trying to figure out where the conversion is happening. You can find the appropriate implicit through the REPL, or through IDEA.

Providing Context

There are many cases in programming where everything depends on a Context object in some way: either you’re using a database connection, or you rely on a security principal, or you’re resolving objects from a request or JAAS / LDAP / Spring context… the list goes on. Whatever it is, it’s passed in by the system, it’s absolutely essential, and you can count on most of your API to depend on it in some way. A typical Java way to deal with this is to make it part of the parameter list, or try to ignore it and make it a ThreadLocal object.

   public void doStuff(Context context);

Scala has a better way to deal with this: you can specify implicit parameters on a method.

   def doStuff(implicit context:Context)

which means that anything marked as implicit that is in scope will be applied:

   implicit val context = new Context()
   doStuff  // uses val context automatically.

This is all handled by the compiler: just set up the implicits and Scala will do the rest.
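
A small sketch of a context flowing through a call chain (Context and the method names here are invented):

```scala
case class Context(user: String)

def audit(msg: String)(implicit ctx: Context): String =
  ctx.user + ": " + msg

// ctx is passed along to audit without ever being named.
def doStuff()(implicit ctx: Context): String =
  audit("did stuff")

implicit val context: Context = Context("alice")
val logLine = doStuff() // the compiler supplies context
```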

A place for everything

So now you have a number of implicit methods, value classes, type definitions and whatnot. In Scala, there's a place to keep all this stuff that is so intuitive, you may not think of it as a place at all. It's the package object.

Package objects are supremely useful. You define a file called package.scala, then in the file you put

package object mypackagename {
  implicit def string2username(input : String) : Option[Username] = parseUsername(input)
}

and after that point, anything with ‘import mypackagename._’ will import the package object as well. One less thing to think about.

Free Data Transfer Objects

Case classes. So called because they’re used in case statements (see below).

case class Data(propertyOne:String, propertyTwo:Int)

Immutable, convenient, and packed with functionality. They make creating data types or DTOs trivial. They’re cool.
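
A taste of what you get for free, using the Data class above:

```scala
case class Data(propertyOne: String, propertyTwo: Int)

val a = Data("hello", 1)
val b = a.copy(propertyTwo = 2) // immutable update: a is untouched

val same = a == Data("hello", 1) // structural equality, not reference equality
```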

Free Range (Organic) Checking

Scala contains a powerful pattern matching feature. You can think of it as a switch statement on steroids.

a match {
   case Something => doThis
   case SomethingElse => doThat
}

There are so many things that feed into pattern matching — extractor objects, aliases, matching on types, regular expressions and wildcards — it’s the ‘regexp’ of Scala. It takes in an object as input, filters it, and manipulates it in exactly the way you want.
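
As a sampler, here's a single match mixing type tests, a guard, a regular expression extractor and the wildcard (the rules are arbitrary):

```scala
// A regex whose group is bound in the pattern below.
val Port = """port=(\d+)""".r

def describe(a: Any): String = a match {
  case i: Int if i > 0 => "positive: " + i      // type test plus guard
  case Port(digits)    => "port " + digits      // regex extractor
  case s: String       => "string: " + s        // plain type test
  case _               => "something else"      // wildcard
}
```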

But the thing I really like about pattern matching is what it doesn’t let you do. It doesn’t let you miss something.

There’s a feature called sealed classes which lets you define all the valid types in a file. If you define a trait with the sealed keyword inside a file, then any classes you define inside that file that extend that trait are the ONLY classes that will extend that trait.

sealed trait Message { def msg: String }
case class Success(msg:String) extends Message
case class Failure(msg:String) extends Message

The compiler knows this, so when you use pattern matching against that trait, it knows the value must be one of those case classes. If the match doesn't cover all of them, it will print a warning saying that you don't have an exhaustive match.

def log(msg: Message) = msg match {
  case Success(str) => println("Success: " + str)
  case Failure(str) => println("Failure: " + str)
}

And More

But that’s enough for now. I hope this gives you an idea of why I like Scala. If you have any features dear to your heart, add them to the comments and let me know what makes you happy.