This is the fourth in a series of posts about setting up Play WS as a TLS client for a "secure by default" setup and configuration through text files, along with the research and thinking behind the setup. I recommend The Most Dangerous Code in the World for more background.
Previous posts are:
- Fixing The Most Dangerous Code In The World (MITM, Protocols, Cipher Suites, Cert Stores)
- Fixing X.509 Certificates (General PKI, Weak Signature and Key Algorithms)
- Fixing Certificate Revocation (CRL, OCSP)
The Attack: Man in the Middle
The scenario that requires hostname verification is when an attacker is on your local network and can subvert DNS or ARP to redirect traffic through its own machine. When you make the call to https://example.com, the attacker can make the name resolve to a local IP address, and then send you a TLS handshake with a certificate chain.
The attacker needs you to accept a public key that it owns so that you will continue the conversation with it, so it can't simply hand you the certificate chain that belongs to example.com – that has a different public key, and the attacker can't use it. Also, the attacker can't give you a certificate chain that points to example.com and has the attacker's public key – the CA should (in theory) refuse to sign the certificate, since the domain belongs to someone else.
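However… if the attacker gets a CA to sign a certificate for a site that it does have control of, then the attack works like this: the attacker redirects your request for example.com to its own server, which presents the valid certificate chain for the attacker's own site. The chain checks out against the trust store, so a client that never compares the certificate's name to the hostname it asked for will accept the connection.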
In the example, DNS is compromised, but an attacker could just as well proxy the request to another server and return the result from a different server. The key to any kind of check for server identity is that the check of the hostname must happen on the client end, and must be tied to the original request coming in. It must happen out of band, and cannot rely on any response from the server.
The Defense: Hostname Verification
In theory, hostname verification in HTTPS sounds simple enough. You call "https://example.com", save off the "example.com" bit, and then check it against the X.509 certificate from the server. If the names don't match, you terminate the connection.
So where do you look in the certificate? According to RFC 6125, hostname verification should be done against the dNSName field of the certificate's subjectAltName extension. In some legacy implementations, the check is done against the certificate's commonName field, but using commonName for this has been deprecated for quite a while now.
You generate a certificate with the right name by using keytool with the -ext flag (for example, -ext SAN=dns:example.com) to say the certificate has example.com as the DNS record in the subjectAltName field. To view the certificate afterwards, keytool -list -v prints its extensions, including the subjectAltName.
Is it really that simple? Yes. HTTPS is very specific about verifying server identity. You make an HTTPS request, then you check that the certificate that comes back matches the hostname of the request. There's some bits added on about wildcards, but for the most part it's not complicated.
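The wildcard bit can be sketched in a few lines. This is a minimal illustration, assuming the convention from RFC 2818 and RFC 6125 that a * is only honored as the entire leftmost label and stands in for exactly one label. The class and method names are made up for this post, and this is not how any particular JSSE class implements the check.

```java
import java.util.Locale;

final class WildcardMatch {
    // Returns true if host matches pattern, where pattern is either an exact
    // DNS name or a name of the form "*.rest" (wildcard as the whole leftmost label).
    static boolean matches(String host, String pattern) {
        String h = host.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            return h.equals(p); // no wildcard: plain case-insensitive comparison
        }
        int firstDot = h.indexOf('.');
        // The wildcard must stand in for exactly one non-empty label.
        return firstDot > 0 && h.substring(firstDot + 1).equals(p.substring(2));
    }
}
```

Under these assumptions, *.example.com covers www.example.com but neither example.com itself nor a.b.example.com.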
In fact (and this is part of the problem), you can say that HTTPS is defined by the hostname verification requirement for HTTP on top of TLS. The reason why HTTPS exists as a distinct RFC apart from TLS is the specifics of the hostname verification. LDAP has a distinct secure protocol, LDAPS, which handles hostname verification differently. Every protocol that uses TLS must have its own application level security on top of TLS. TLS, by itself, doesn't define server identity.
Because TLS in its raw form doesn't do hostname verification, anything that uses raw TLS without doing any server identity check is insecure. This is pretty amazing information in itself, so let's break this down, and repeat it in bold face and all caps:
A) VERIFICATION OF SERVER IDENTITY IS APPLICATION PROTOCOL SPECIFIC.
B) BECAUSE OF (A), TLS LEAVES IT TO THE APPLICATION TO DO THE HOSTNAME VERIFICATION.
C) YOU CANNOT SECURELY USE RAW TLS WITHOUT ADDING HOSTNAME VERIFICATION.
Given the previous points and the consequences of failure, this would lead us to believe that there must be a safety system in place to validate the TLS configuration before opening a connection. To my knowledge, no such system exists. This is despite the fact that there are really only three main protocols that use TLS: HTTPS, LDAPS, and IMAPS.
TLS LIBRARIES SHOULD MAKE IT IMPOSSIBLE TO USE THEM RAW WITHOUT ANY HOSTNAME VERIFICATION. THEY DO NOT.
As you might guess, this makes lack of hostname verification a very common failure. The Most Dangerous Code in the World specifically calls it out in HTTPS client libraries. This is bad, because man in the middle attacks are extremely common.
In 2011, RFC 6125 was published to bridge this gap, but most TLS implementations don't support it. In the absence of a known guide, using RFC 2818 is not unreasonable, and certainly better than nothing.
Implementation in JSSE
The JSSE Reference Guide goes out of its way to mention the need for hostname verification.
"IMPORTANT NOTE: When using raw SSLSockets/SSLEngines you should always check the peer's credentials before sending any data. The SSLSocket/SSLEngine classes do not automatically verify, for example, that the hostname in a URL matches the hostname in the peer's credentials. An application could be exploited with URL spoofing if the hostname is not verified."
A little later, the reference guide mentions it again, in context with HttpsURLConnection:
[T]he SSL/TLS protocols do not specify that the credentials received must match those that peer might be expected to send. If the connection were somehow redirected to a rogue peer, but the rogue's credentials presented were acceptable based on the current trust material, the connection would be considered valid. When using raw SSLSockets/SSLEngines you should always check the peer's credentials before sending any data. The SSLSocket and SSLEngine classes do not automatically verify that the hostname in a URL matches the hostname in the peer's credentials. An application could be exploited with URL spoofing if the hostname is not verified.
I've never heard the term "URL spoofing" before, and Google shows nothing remotely connected with this term. Ping me if you've heard of it.
Anyway. JSSE does do hostname verification, if you set it up just right. For completeness, I'm going to go over all the options.
Hostname Verification in 1.6
In 1.6, if you want to use hostname verification, you have one way to do it. If you use HttpsURLConnection, then JSSE will do hostname verification for you by default. Other than that, you're on your own. JSSE 1.6 does not provide any public classes for you to extend; it's all internal.
If you want to use hostname verification on an SSLEngine, you have to get at an instance of sun.security.ssl.SSLEngineImpl and then call sslEngine.trySetHostnameVerification("HTTPS") on it directly, using reflection. This lets ClientHandshaker pass the identifier through to the trust manager.
Hostname Verification in 1.7
JSSE 1.7 provides you with more options for doing HTTPS hostname verification. In addition to HttpsURLConnection, you have the option of using X509ExtendedTrustManager, because it "enables endpoint verification at the TLS layer." What this means in practice is that X509ExtendedTrustManager routes through to X509TrustManagerImpl.checkIdentity, as in JDK 1.6.
The reference guide recommends using X509ExtendedTrustManager rather than the legacy X509TrustManager, and even has a worked example. But there's a catch: X509ExtendedTrustManager is an abstract class, so you must inherit from it. This limits anything fun you might want to do, like aggregating keystore information. As such, it's only useful if you're doing minor tweaks.
In 1.7, the manual method of doing it is not as bad as 1.6. There's an explicit call you can make (recommended by Stack Overflow):
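A minimal sketch of that call, using only the public JSSE 1.7 API. The class and helper names are made up for illustration; the three calls inside the helper are the whole technique.

```java
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

final class EndpointIdentification {
    // Asks JSSE to apply HTTPS-style hostname checks during the handshake.
    static void enableHttpsVerification(SSLEngine engine) {
        SSLParameters params = engine.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        // getSSLParameters() hands back a copy, so the change must be set back.
        engine.setSSLParameters(params);
    }
}
```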
But this doesn't actually work with AsyncHttpClient: you'll get a NullPointerException!
The reason why: AsyncHttpClient creates an SSLEngine without the peerHost and peerPort.
JSSE assumes that if you called sslParams.setEndpointIdentificationAlgorithm("HTTPS"), then you also created the SSL engine with the peer host and port, that is, with sslContext.createSSLEngine(peerHost, peerPort). Without them, setEndpointIdentificationAlgorithm is not an option. (The lack of hostname could also possibly have an effect on Server Name Indication, although I haven't tested that.)
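The difference is easy to see from the engine itself. This sketch assumes the JVM's default SSLContext and uses made-up helper names. It creates one engine each way, so you can compare what getPeerHost() reports.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

final class PeerHostDemo {
    static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    // Engine created with advisory peer information, as JSSE expects
    // when endpoint identification is switched on.
    static SSLEngine withPeer(String host, int port) {
        return defaultContext().createSSLEngine(host, port);
    }

    // Engine created without peer information: there is no hostname
    // for the identity check to compare the certificate against.
    static SSLEngine withoutPeer() {
        return defaultContext().createSSLEngine();
    }
}
```

Only the engine from withPeer carries a hostname; the one from withoutPeer reports null for getPeerHost().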
There is another way to do hostname verification though. You can pass in a custom HostnameVerifier to the SSLContext.
HostnameVerifier is an interface that normally says "if you've tried resolving the hostname yourself and got nothing, then try this." However, since AsyncHttpClient works directly with SSLEngine, the Netty provider will call the HostnameVerifier on every call to do hostname verification. This gives us the avenue we need.
I ended up using Kevin Locke's guide to implement a HostnameVerifier that calls to Sun's internal HostnameChecker, the same way that setEndpointIdentificationAlgorithm("HTTPS") does. The end result is pretty simple.
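Sun's HostnameChecker is an internal class and is not reachable from every JVM, so what follows is not that implementation. It is a simplified stand-in that shows the shape of a HostnameVerifier, under the assumption that only dNSName entries in subjectAltName count and that names are compared exactly (no wildcards, no commonName fallback). The class name is made up.

```java
import java.security.cert.Certificate;
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.List;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLPeerUnverifiedException;
import javax.net.ssl.SSLSession;

final class DnsNameHostnameVerifier implements HostnameVerifier {
    private static final Integer DNS_NAME = 2; // GeneralName tag for dNSName

    @Override
    public boolean verify(String hostname, SSLSession session) {
        try {
            Certificate[] chain = session.getPeerCertificates();
            if (chain.length == 0 || !(chain[0] instanceof X509Certificate)) {
                return false;
            }
            // The server's own certificate comes first in the chain.
            X509Certificate leaf = (X509Certificate) chain[0];
            return anyDnsNameMatches(hostname, leaf.getSubjectAlternativeNames());
        } catch (SSLPeerUnverifiedException | CertificateParsingException e) {
            return false; // fail closed if the peer or its certificate is unusable
        }
    }

    // Each entry is a two-element list: an Integer type tag, then the value.
    static boolean anyDnsNameMatches(String hostname, Collection<List<?>> altNames) {
        if (hostname == null || altNames == null) {
            return false; // no subjectAltName extension at all
        }
        for (List<?> entry : altNames) {
            if (DNS_NAME.equals(entry.get(0))
                    && hostname.equalsIgnoreCase(String.valueOf(entry.get(1)))) {
                return true;
            }
        }
        return false;
    }
}
```

Returning false on every failure path is deliberate: a verifier that cannot read the certificate should reject the connection rather than let it through.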
After that, I could set it on the builder and have hostname verification triggered.
Disabling hostname verification is available as a "loose" option, and there's an option to use your own hostname verifier if you're not on an Oracle JVM.
Of course, there are always more issues.
The hostname verification needs to happen after an SSL handshake has been completed. If you call session.getPeerCertificates() before the SSL handshake has been established, you'll get an SSLPeerUnverifiedException. You need to set up an SSL handshake listener (using SslHandler for Netty, SSLBaseFilter.HandshakeListener for Grizzly) and only do hostname verification after the session is valid.
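To see why the ordering matters, here is a small sketch that asks a session for its peer certificates and treats SSLPeerUnverifiedException as "not ready yet". The helper names are made up, and it assumes the JVM's default SSLContext. On an engine that has not started its handshake, the check reports that there is no verified peer.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLPeerUnverifiedException;
import javax.net.ssl.SSLSession;

final class HandshakeGuard {
    // True only if the session already carries peer certificates to verify against.
    static boolean hasVerifiedPeer(SSLSession session) {
        try {
            return session.getPeerCertificates().length > 0;
        } catch (SSLPeerUnverifiedException e) {
            return false; // handshake not finished, or the peer did not authenticate
        }
    }

    // Builds a client-mode engine that has not exchanged any handshake messages.
    static SSLEngine freshClientEngine(String host, int port) {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine(host, port);
            engine.setUseClientMode(true);
            return engine;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }
}
```

A guard like this is no substitute for a handshake listener. It only makes the failure mode explicit, so a verifier can refuse to run against a session that is not valid yet.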
AsyncHttpClient 1.8.5 doesn't use a handshake listener. Instead, it uses a completion handler, and potentially hostname verification may fail unexpectedly if you set a custom hostname verifier. It seems to work most of the time in the field due to a race condition – by the time the completion handler is notified, the handshake has completed already. Still working on this.
Oracle's HostnameChecker does not implement RFC 6125 correctly. Even HttpClient's StrictHostnameVerifier seems not to be up to spec, and there are many cases where hostname checkers have failed against NULL or bad CN strings.
Nevertheless, it is the way that HTTPS is supposed to work. If you have a problem with hostname verification, you really need to check your X.509 certificate and make sure that your subjectAltName's dNSName field is set correctly.