GitList

doc/tls-crypt-v2.txt

Client-specific tls-crypt keys (--tls-crypt-v2)
===============================================

This document describes the ``--tls-crypt-v2`` option, which enables OpenVPN
to use client-specific ``--tls-crypt`` keys.

Rationale
---------

``--tls-auth`` and ``tls-crypt`` use a pre-shared group key, which is shared
among all clients and servers in an OpenVPN deployment. If any client or
server is compromised, the attacker will have access to this shared key, and it
will no longer provide any security. To reduce the risk of losing pre-shared
keys, ``tls-crypt-v2`` adds the ability to supply each client with a unique
tls-crypt key. This allows large organisations and VPN providers to profit
from the same DoS and TLS stack protection that small deployments can already
achieve using ``tls-auth`` or ``tls-crypt``.

Also, for ``tls-crypt``, even if all these peers succeed in keeping the key
secret, the key lifetime is limited to roughly 8000 years, divided by the
number of clients (see the ``--tls-crypt`` section of the man page). Using
client-specific keys, we lift this lifetime requirement to roughly 8000 years
for each client key (which "Should Be Enough For Everybody (tm)").

Introduction
------------

``tls-crypt-v2`` uses an encrypted cookie mechanism to introduce
client-specific tls-crypt keys without introducing a lot of server-side state.
The client-specific key is encrypted using a server key. The server key is the
same for all servers in a group. When a client connects, it first sends the
encrypted key to the server, such that the server can decrypt the key and all
messages can thereafter be encrypted using the client-specific key.

A wrapped (encrypted and authenticated) client-specific key can also contain
metadata. The metadata is wrapped together with the key, and can be used to
allow servers to identify clients and/or key validity. This allows the server
to abort the connection immediately after receiving the first packet, rather
than performing an entire TLS handshake. Aborting the connection this early
greatly improves the DoS resilience and reduces attack surface against
malicious clients that have the ``tls-crypt`` or ``tls-auth`` key. This is
particularly relevant for large deployments (think lost key or disgruntled
employee) and VPN providers (clients are not trusted).

To allow for a smooth transition, ``tls-crypt-v2`` is designed such that a
server can enable both ``tls-crypt-v2`` and either ``tls-crypt`` or
``tls-auth``. This is achieved by introducing a P_CONTROL_HARD_RESET_CLIENT_V3
opcode, that indicates that the client wants to use ``tls-crypt-v2`` for the
current connection.

For an exact specification and more details, read the Implementation section.

Implementation
--------------

When setting up a tls-crypt-v2 group (similar to generating a tls-crypt or
tls-auth key previously):

1. Generate a tls-crypt-v2 server key using OpenVPN's ``--tls-crypt-v2-genkey server``.
This key contains 2 512-bit keys, of which we use:

* the first 256 bits of key 1 as AES-256-CTR encryption key ``Ke``
* the first 256 bits of key 2 as HMAC-SHA-256 authentication key ``Ka``

This format is similar to the format for regular ``tls-crypt``/``tls-auth``
and data channel keys, which allows us to reuse code.

2. Add the tls-crypt-v2 server key to all server configs
(``tls-crypt-v2 /path/to/server.key``)

When provisioning a client, create a client-specific tls-crypt key:

1. Generate 2048 bits client-specific key ``Kc`` using OpenVPN's ``--tls-crypt-v2-genkey client``

2. Optionally generate metadata

The first byte of the metadata determines the type. The initial
implementation supports the following types:

0x00 (USER): User-defined free-form data.
0x01 (TIMESTAMP): 64-bit network order unix timestamp of key generation.

The timestamp can be used to reject too-old tls-crypt-v2 client keys.

User metadata could for example contain the users certificate serial, such
that the incoming connection can be verified against a CRL.

If no metadata is supplied during key generation, openvpn defaults to the
TIMESTAMP metadata type.

3. Create a wrapped client key ``WKc``, using the same nonce-misuse-resistant
SIV construction we use for tls-crypt:

``len = len(WKc)`` (16 bit, network byte order)

``T = HMAC-SHA256(Ka, len || Kc || metadata)``

``IV = 128 most significant bits of T``

``WKc = T || AES-256-CTR(Ke, IV, Kc || metadata) || len``

Note that the length of ``WKc`` can be computed before composing ``WKc``,
because the length of each component is known (and AES-256-CTR does not add
any padding).

4. Create a tls-crypt-v2 client key: PEM-encode ``Kc || WKc`` and store in a
file, using the header ``-----BEGIN OpenVPN tls-crypt-v2 client key-----``
and the footer ``-----END OpenVPN tls-crypt-v2 client key-----``. (The PEM
format is simple, and following PEM allows us to use the crypto lib function
for en/decoding.)

5. Add the tls-crypt-v2 client key to the client config
(``tls-crypt-v2 /path/to/client-specific.key``)

When setting up the openvpn connection:

1. The client reads the tls-crypt-v2 key from its config, and:

1. loads ``Kc`` as its tls-crypt key,
2. stores ``WKc`` in memory for sending to the server.

2. To start the connection, the client creates a P_CONTROL_HARD_RESET_CLIENT_V3
message, wraps it with tls-crypt using ``Kc`` as the key, and appends
``WKc``. (``WKc`` must not be encrypted, to prevent a chicken-and-egg
problem.)

3. The server receives the P_CONTROL_HARD_RESET_CLIENT_V3 message, and

1. reads the WKc length field from the end of the message, and extracts WKc
from the message
2. unwraps ``WKc``
3. uses unwrapped ``Kc`` to verify the remaining
P_CONTROL_HARD_RESET_CLIENT_V3 message's (encryption and) authentication.

The message is dropped and no error response is sent when either 3.1, 3.2 or
3.3 fails (DoS protection).

4. Server optionally checks metadata using a --tls-crypt-v2-verify script

This allows early abort of connection, *before* we expose any of the
notoriously dangerous TLS, X.509 and ASN.1 parsers and thereby reduces the
attack surface of the server.

The metadata is checked *after* the OpenVPN three-way handshake has
completed, to prevent DoS attacks. (That is, once the client has proved to
the server that it possesses Kc, by authenticating a packet that contains the
session ID picked by the server.)

A server should not send back any error messages if metadata verification
fails, to reduce attack surface and maximize DoS resilience.

6. Client and server use ``Kc`` for (un)wrapping any following control channel
messages.

Considerations
--------------

To allow for a smooth transition, the server implementation allows
``tls-crypt`` or ``tls-auth`` to be used simultaneously with ``tls-crypt-v2``.
This specification does not allow simultaneously using ``tls-crypt-v2`` and
connections without any control channel wrapping, because that would break DoS
resilience.

WKc includes a length field, so we leave the option for future extension of the
P_CONTROL_HEAD_RESET_CLIENT_V3 message open. (E.g. add payload to the reset to
indicate low-level protocol features.)

``tls-crypt-v2`` uses fixed crypto algorithms, because:

* The crypto is used before we can do any negotiation, so the algorithms have
to be predefined.
* The crypto primitives are chosen conservatively, making problems with these
primitives unlikely.
* Making anything configurable adds complexity, both in implementation and
usage. We should not add any more complexity than is absolutely necessary.

Potential ``tls-crypt-v2`` risks:

* Slightly more work on first connection (``WKc`` unwrap + hard reset unwrap)
than with ``tls-crypt`` (hard reset unwrap) or ``tls-auth`` (hard reset auth).
* Flexible metadata allow mistakes
(So we should make it easy to do it right. Provide tooling to create client
keys based on cert serial + CA fingerprint, provide script that uses CRL (if
available) to drop revoked keys.)