---
layout: docs
page_title: Consul Core Security Model
description: >-
  Security model including requirements, recommendations, and threats for the core Consul product.
---
## Overview
Consul enables automation of network configurations, service discovery, and secure network connectivity across any
cloud or runtime.
Consul uses a lightweight gossip and RPC system which provides various essential features. Both of these systems
provide security mechanisms which should be used to enable confidentiality, integrity and authentication.
Using defense in depth is crucial for Consul security, and deployment requirements may differ drastically depending on
your use case. Some security features for multi-tenant deployments are offered exclusively in the
[Enterprise](/docs/enterprise) version. This documentation may need to be adapted to your
environment, but the general mechanisms for a secure Consul deployment revolve around:
- **mTLS** - Mutual authentication of both the TLS server and client x509 certificates prevents internal abuse from
unauthorized access to network components within the cluster.
- **ACLs** - Enable role-based access controls for authenticated connections by granting capabilities for an individual
human, or machine operator identity via an ACL token to authorize actions within a cluster. Optionally, custom
[authentication methods](/docs/security/acl/auth-methods) can be used to enable trusted external parties to authorize
ACL token creation.
- **Namespaces** <EnterpriseAlert inline /> - Read and write operations can be scoped to a logical namespace to restrict
access to Consul components within a multi-tenant environment.
- **Sentinel Policies** <EnterpriseAlert inline /> - Sentinel policies enable policy-as-code for granular control over
the built-in key-value store.
### Personas
It helps to consider the following types of personas when managing the security requirements of a Consul deployment.
The granularity may change depending on your team's requirements.
- **System Administrator** - This is someone who has access to the infrastructure underlying the Consul cluster.
Often they have access to SSH or RDP directly into a server within a cluster through a bastion host. Ultimately they
have read, write and execute permissions for the actual Consul binary. This binary is the same for server and client
agents using different configuration files. These users potentially have sudo, administrative, or some other
super-user access to the underlying compute resource. They have access to all persisted data on disk, or in memory.
This would include ACL tokens, certificates, and other secrets stored on the system. Users like these are essentially
totally trusted, as they have administrative rights to the underlying operating-system with the ability to configure,
start, and stop the agent.
- **Consul Administrator** - This is someone (probably the same System Administrator) who has access to define the
Consul agent configurations for servers and clients, and/or has a Consul management ACL token. They also have total
rights to all of the parts in the Consul system including the ability to manage all services within a cluster.
- **Consul Operator** - This is someone who likely has restricted capabilities to use their namespace within a cluster.
- **Developer** - This is someone who is responsible for creating, and possibly deploying applications connected, or
configured with Consul. In some cases they may have no access, or limited capabilities to view Consul information,
such as through metrics, or logs.
- **User** - This is the end user, using applications backed by services managed by Consul. In some cases services may
be public facing on the internet such as a web server, typically through a load-balancer, or ingress gateway. This is
someone who should not have any network access to the Consul agent APIs.
### Secure Configuration
Consul's security model is applicable only if all parts of the system are running with a secure configuration; **Consul
is not secure-by-default.** Without the following mechanisms enabled in Consul's configuration, it may be possible to
abuse access to a cluster. Like all security considerations, administrators must determine what is appropriate for their
environment and adapt these configurations accordingly.
#### Requirements
- **mTLS** - Mutual authentication of both the TLS server and client x509 certificates prevents internal abuse through
unauthorized access to Consul agents within the cluster.
- [`verify_incoming`](/docs/agent/options#verify_incoming) - By default this is false, and should almost always be set
to true to require TLS verification for incoming client connections. This applies to both server RPC and to the
HTTPS API.
- [`verify_incoming_https`](/docs/agent/options#verify_incoming_https) - By default this is false, and should be set
to true to require clients to provide a valid TLS certificate when the Consul HTTPS API is enabled. TLS for the API
may not be necessary if it is exclusively served over a loopback interface such as `localhost`.
- [`verify_incoming_rpc`](/docs/agent/options#verify_incoming_rpc) - By default this is false, and should almost
always be set to true to require clients to provide a valid TLS certificate for Consul agent RPCs.
- [`verify_outgoing`](/docs/agent/options#verify_outgoing) - By default this is false, and should be set to true to
require TLS for outgoing connections from server or client agents. Servers that specify `verify_outgoing = true`
will always talk to other servers over TLS, but they still accept non-TLS connections to allow for a transition of
all clients to TLS. Currently the only way to enforce that no client can communicate with a server unencrypted is
to also enable `verify_incoming` which requires client certificates too.
- [`enable_agent_tls_for_checks`](/docs/agent/options#enable_agent_tls_for_checks) - By default this is false, and
should almost always be set to true to require mTLS to set up the client for HTTP or gRPC health checks. This was
added in Consul 1.0.1.
- [`verify_server_hostname`](/docs/agent/options#verify_server_hostname) - By default this is false, and should be
set to true to require that the TLS certificate presented by the servers matches
`server.<datacenter>.<domain>` hostname for outgoing TLS connections. The default configuration does not verify the
hostname of the certificate, only that it is signed by a trusted CA. This setting is critical to prevent a
compromised client agent from being restarted as a server and having all cluster state including all ACL tokens and
Connect CA root keys replicated to it. This setting was introduced in 0.5.1. From version 0.5.1 to 1.4.0 we
documented that setting `verify_server_hostname` to true implied `verify_outgoing`; however, due to a bug this was
not the case, so setting only `verify_server_hostname` resulted in plaintext communication between client and server.
See [CVE-2018-19653](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-19653) for more details. This is fixed
in 1.4.1.
- [`auto_encrypt`](/docs/agent/options#auto_encrypt) - Enables automated TLS certificate distribution for client
agent RPC communication using the Connect CA. Using this configuration a [`ca_file`](/docs/agent/options#ca_file)
and ACL token would still need to be distributed to client agents.
- [`allow_tls`](/docs/agent/options#allow_tls) - By default this is false, and should be set to true on server
agents to allow certificates to be automatically generated and distributed from the Connect CA to client agents.
- [`tls`](/docs/agent/options#tls) - By default this is false, and should be set to true on client agents to
automatically request a client TLS certificate from the server's Connect CA.

**Example Server Agent TLS Configuration**
```hcl
verify_incoming = true
verify_outgoing = true
verify_server_hostname = true

ca_file = "consul-agent-ca.pem"
cert_file = "dc1-server-consul-0.pem"
key_file = "dc1-server-consul-0-key.pem"

auto_encrypt {
  allow_tls = true
}
```

**Example Client Agent TLS Configuration**
```hcl
verify_incoming = false
verify_outgoing = true
verify_server_hostname = true

ca_file = "consul-agent-ca.pem"

auto_encrypt {
  tls = true
}
```

-> The client agent TLS configuration from above sets [`verify_incoming`](/docs/agent/options#verify_incoming) to
false which assumes all incoming traffic is restricted to `localhost`. The primary benefit for this configuration
would be to avoid provisioning client TLS certificates (in addition to ACL tokens) for all tools or applications
using the local Consul agent. In this case ACLs should be enabled to provide authorization and only ACL tokens would
need to be distributed.
- **ACLs** - The access control list (ACL) system provides a security mechanism for Consul administrators to grant
capabilities tied to an individual human, or machine operator identity. To ultimately secure the ACL system,
administrators should configure the [`default_policy`](/docs/agent/options#acl_default_policy) to "deny".
The [system](/docs/acl/acl-system) is comprised of five major components:
- **🗝 Token** - API key associated with policies, roles, or service identities.
- **📜 Policy** - Set of rules to grant or deny access to various Consul resources.
- **🎭 Role** - Grouping of policies, and service identities.
- **👤 Service or Node Identity** - Synthetic policy granting a predefined set of permissions typical for services
deployed within Consul.
- **🏷 Namespace** <EnterpriseAlert inline /> - a named, logical scoping of Consul Enterprise resources, typically to
enable multi-tenant environments. Consul OSS clusters always operate within the “default” namespace.
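
The default-deny recommendation above could be expressed in an agent configuration along the following lines. This is a minimal sketch rather than a complete ACL setup: it assumes the `acl` stanza of the agent configuration, and `enable_token_persistence` is an optional extra that the text above does not discuss.

```hcl
# Sketch of an agent ACL configuration with a default-deny policy.
acl {
  enabled = true

  # Deny any request that is not explicitly allowed by a token's policies.
  default_policy = "deny"

  # Assumption: persist tokens set through the API so they survive agent restarts.
  enable_token_persistence = true
}
```

With a default-deny policy, tokens and their policies need to exist before agents and services can register, so ACL bootstrapping is usually planned alongside this change.
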
- **Gossip Encryption** - A shared, base64-encoded 32-byte symmetric key is required to [encrypt Serf gossip
communication](https://learn.hashicorp.com/tutorials/consul/gossip-encryption-secure) within a cluster using
AES GCM. The key size determines which AES encryption types to use; 16, 24, or 32 bytes to select AES-128, AES-192,
or AES-256 respectively. 32-byte keys are preferable and are the default size generated by the
[`keygen`](/commands/keygen) command. This key should be
[regularly rotated](https://support.hashicorp.com/hc/en-us/articles/360044051754-Consul-Gossip-Key-Rotation) using
the built-in [keyring management](/commands/keyring) features of Consul.
Two optional gossip encryption options enable Consul servers without gossip encryption to safely upgrade. After
upgrading, the verification options should be enabled, or removed to set them to their default state:
- [`encrypt_verify_incoming`](/docs/agent/options#encrypt_verify_incoming) - By default this is true to enforce
encryption on _incoming_ gossip communications.
- [`encrypt_verify_outgoing`](/docs/agent/options#encrypt_verify_outgoing) - By default this is true to enforce
encryption on _outgoing_ gossip communications.
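
As a rough illustration, the gossip encryption settings described above might appear in an agent configuration as shown here. The key value is a placeholder, not a real key, and the `encrypt` option name is an assumption not spelled out in the text above; the two verification options are shown at their documented defaults purely for visibility.

```hcl
# Sketch of gossip encryption settings for a Consul agent.
# Placeholder only: substitute a key generated by the `keygen` command.
encrypt = "<base64-encoded 32-byte key>"

# Both options default to true; set them to false only temporarily while
# upgrading a cluster that was running without gossip encryption.
encrypt_verify_incoming = true
encrypt_verify_outgoing = true
```
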
- **Namespaces** <EnterpriseAlert inline /> - Read and write operations should be scoped to logical namespaces to
restrict access to Consul components within a multi-tenant environment. Furthermore, this feature can be used to
enable a self-service approach to Consul ACL administration for teams within a scoped namespace.
- **Sentinel Policies** <EnterpriseAlert inline /> - Sentinel policies allow for granular control over the built-in
key-value store.
- **Ensure Script Checks are Disabled** - Consul's agent optionally has an HTTP API, which can be exposed beyond
`localhost`. If this is the case, `enable_script_checks` must be false; otherwise, even with ACLs configured, script
checks present a remote code execution threat. `enable_local_script_checks` provides a secure alternative if the
HTTP API must be exposed and is available from 1.3.0 on. This feature was also back-ported to patch releases 0.9.4,
1.1.1, and 1.2.4 as described here. This is not enabled by default.
- **Ensure Remote Execution is Disabled** - Consul includes a `consul exec` feature allowing execution of arbitrary
commands across the cluster. This has been disabled by default since 0.8.0. We recommend leaving it disabled. If enabled,
extreme care must be taken to ensure correct ACLs restrict access to execute arbitrary code on the cluster.
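
The last two requirements could be captured in an agent configuration roughly as follows. This is a sketch under assumptions: `disable_remote_exec` is the agent option assumed here to control `consul exec`, and enabling `enable_local_script_checks` presumes that script checks defined in local configuration files are still wanted.

```hcl
# Do not allow script checks to be registered through the HTTP API.
enable_script_checks = false

# Assumption: script checks from local configuration files are still needed.
enable_local_script_checks = true

# Assumption: this option keeps the `consul exec` feature disabled.
disable_remote_exec = true
```
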
#### Recommendations
- **Rotate Credentials** - Using short-lived credentials and rotating them frequently is highly recommended for
production environments to limit the blast radius of potentially compromised secrets and to enable basic auditing.
- **ACL Tokens** - Consul APIs require an ACL token to authorize actions within a cluster.
- **X.509 Certificates** - Rotate certificates used by the Consul agent; e.g. integrate with Vault's PKI secret engine
to automatically generate and renew dynamic, unique X.509 certificates for each Consul node with a short TTL. Client
certificates can be automatically rotated by Consul when using `auto_encrypt` such that only server certificates
would be managed by Vault.
- **Gossip Keys** - The encryption keys used by the internal gossip protocol for Consul agents can be
regularly rotated using the built-in keyring management features.
- **Running without Root** - Consul agents can be run as unprivileged users that only require access to the
data directory.
- **Linux Security Modules** - Use security modules that integrate directly into the operating system, such as
AppArmor, SELinux, and Seccomp, on Consul agent hosts.
- **Customize TLS Settings** - TLS settings such as the [available cipher suites](/docs/agent/options#tls_cipher_suites),
should be tuned to fit the needs of your environment.
- [`tls_min_version`](/docs/agent/options#tls_min_version) - Used to specify the minimum TLS version to use.
- [`tls_cipher_suites`](/docs/agent/options#tls_cipher_suites) - Used to specify which TLS cipher suites are allowed.
- [`tls_prefer_server_cipher_suites`](/docs/agent/options#tls_prefer_server_cipher_suites) - Used to specify which TLS
cipher suites are preferred on the server side.
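
A hedged sketch of how these three TLS options might be set together is shown below. The specific minimum version and cipher suites are illustrative assumptions, not recommendations from this document; check the values supported by your Consul version before using them.

```hcl
# Illustrative values only; tune these to your environment.
tls_min_version = "tls12"

# Assumption: cipher suites are given as a comma-separated string.
tls_cipher_suites = "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"

# Prefer the server's cipher suite ordering over the client's.
tls_prefer_server_cipher_suites = true
```
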
- **Customize HTTP Response Headers** - Additional security headers, such as
[`X-XSS-Protection`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection), can be
[configured](https://www.consul.io/docs/agent/options#response_headers) for HTTP API responses.
```hcl
http_config {
  response_headers {
    "X-Frame-Options" = "DENY"
  }
}
```
- **Customize Default Limits** - Consul has a number of builtin features with default connection limits that should be
tuned to fit your environment.
- [`http_max_conns_per_client`](/docs/agent/options#http_max_conns_per_client) - Used to limit concurrent access from
a single client to the HTTP(S) endpoint on Consul agents.
- [`https_handshake_timeout`](/docs/agent/options#https_handshake_timeout) - Used to time out TLS connections for the
HTTP(S) endpoint for Consul agents.
- [`rpc_handshake_timeout`](/docs/agent/options#rpc_handshake_timeout) - Used to time out TLS connections for the RPC
endpoint for Consul agents.
- [`rpc_max_conns_per_client`](/docs/agent/options#rpc_max_conns_per_client) - Used to limit concurrent access from a
single client to the RPC endpoint on Consul agents.
- [`rpc_rate`](/docs/agent/options#rpc_rate) - Disabled by default, this is used to limit (requests/second) for client
agents making RPC calls to server agents.
- [`rpc_max_burst`](/docs/agent/options#rpc_max_burst) - Used as the token bucket size for client agents making RPC
calls to server agents.
- [`kv_max_value_size`](/docs/agent/options#kv_max_value_size) - Used to configure the max number of bytes in a
key-value API request.
- [`txn_max_req_len`](/docs/agent/options#txn_max_req_len) - Used to configure the max number of bytes in a
transaction API request.
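
The limits listed above could be tuned in an agent configuration along these lines. This is a sketch that assumes the options are grouped under a `limits` stanza; every number is an illustrative placeholder rather than a recommended value.

```hcl
# Sketch of connection and request limits; all values are illustrative.
limits {
  # Concurrent connections allowed from a single client.
  http_max_conns_per_client = 200
  rpc_max_conns_per_client  = 100

  # Time allowed for TLS handshakes to complete.
  https_handshake_timeout = "5s"
  rpc_handshake_timeout   = "5s"

  # Client agent RPC rate limiting (requests per second and token bucket size).
  rpc_rate      = 100
  rpc_max_burst = 1000

  # Maximum request sizes, in bytes, for the key-value and transaction APIs.
  kv_max_value_size = 524288
  txn_max_req_len   = 524288
}
```

Since `rpc_rate` is disabled by default, setting it too low may throttle legitimate client agents, so it is worth testing against real traffic first.
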
- **Secure UI Access** - Access to Consul's built-in UI can be secured in various ways:
- **mTLS** - Enabling HTTPS with mutual TLS authentication is recommended, but requires extra tooling to terminate
the mTLS connection, preferably on an operator's local machine using a proxy script.
- **TLS** - Enabling HTTPS is recommended where mTLS may not be required for UI access, such as when ACLs are
configured with a default deny.
- **ACL** - ACLs with a default deny policy enable safer UI access by preventing unauthorized access to sensitive
components within the cluster.
- **Restrict HTTP Writes** - The `allow_write_http_from` configuration option restricts write access to agent
endpoints to requests originating from a list of CIDRs.
**Example Agent Configuration**
```hcl
http_config {
  allow_write_http_from = ["127.0.0.0/8"]
}
```

### Threat Model
The following are parts of the core Consul threat model:
- **Consul agent-to-agent communication** - Communication between Consul agents should be secure from eavesdropping.
This requires transport encryption to be enabled on the cluster and covers both TCP and UDP traffic.
- **Consul agent-to-CA communication** - Communication between the Consul server and the configured certificate
authority provider for Connect is always encrypted.
- **Tampering of data in transit** - Any tampering should be detectable and cause Consul to avoid processing
the request.
- **Access to data without authentication or authorization** - All requests must be authenticated and authorized. This
requires that ACLs are enabled on the cluster with a default deny mode.
- **State modification or corruption due to malicious messages** - Ill-formatted messages are discarded and
well-formatted messages require authentication and authorization.
- **Non-server members accessing raw data** - All servers must join the cluster (with proper authentication and
authorization) to begin participating in Raft. Raft data is transmitted over TLS.
- **Denial of Service against a node** - DoS attacks against a node should not compromise the security stance of
the software.
- **Connect-based Service-to-Service communication** - Communications between two Connect-enabled services (natively or
by proxy) should be secure from eavesdropping and provide authentication. This is achieved via mutual TLS.

The following are not part of the threat model for server agents:
- **Access (read or write) to the Consul data directory** - All Consul servers, including non-leaders, persist the full
set of Consul state to this directory. The data includes all KV, service registrations, ACL tokens, Connect CA
configuration, and more. Any read or write to this directory allows an attacker to access and tamper with that data.
- **Access (read or write) to the Consul configuration directory** - Consul configuration can enable or disable the ACL
system, modify data directory paths, and more. Any read or write of this directory allows an attacker to reconfigure
many aspects of Consul. By disabling the ACL system, this may give an attacker access to all Consul data.
- **Memory access to a running Consul server agent** - If an attacker is able to inspect the memory state of a running
Consul server agent the confidentiality of almost all Consul data may be compromised. If you're using an external
Connect CA, the root private key material is never available to the Consul process and can be considered safe. Service
Connect TLS certificates should be considered compromised; they are never persisted by server agents but do exist
in-memory during at least the duration of a Sign request.

The following are not part of the threat model for client agents:
- **Access (read or write) to the Consul data directory** - Consul clients will use the data directory to cache local
state. This includes local services, associated ACL tokens, Connect TLS certificates, and more. Read or write access
to this directory will allow an attacker to access this data. This data is typically a smaller subset of the full data
of the cluster.
- **Access (read or write) to the Consul configuration directory** - Consul client configuration files contain the
address and port information of services, default ACL tokens for the agent, and more. Access to Consul configuration
could enable an attacker to change the port of a service to a malicious port, register new services, and more.
Further, some service definitions have ACL tokens attached that could be used cluster-wide to impersonate that
service. An attacker cannot change cluster-wide configurations such as disabling the ACL system.
- **Memory access to a running Consul client agent** - The blast radius of this is much smaller than a server agent but
the confidentiality of a subset of data can still be compromised. Particularly, any data requested against the agent's
API including services, KV, and Connect information may be compromised. If a particular set of data on the server was
never requested by the agent, it never enters the agent's memory since replication only exists between servers. An
attacker could also potentially extract ACL tokens used for service registration on this agent, since the tokens must
be stored in-memory alongside the registered service.
- **Network access to a local Connect proxy or service** - Communications between a service and a Connect-aware proxy
are generally unencrypted and must happen over a trusted network. This is typically a loopback device. This requires
that other processes on the same machine are trusted, or more complex isolation mechanisms are used such as network
namespaces. This also requires that external processes cannot communicate to the Connect service or proxy (except on
the inbound port). Therefore, non-native Connect applications should only bind to non-public addresses.
- **Improperly Implemented Connect proxy or service** - A Connect proxy or natively integrated service must correctly
serve a valid leaf certificate, verify the inbound TLS client certificate, and call the Consul agent-local authorized
endpoint. If any of this isn't performed correctly, the proxy or service may allow unauthenticated or unauthorized
connections.
#### Internal Threats
- **Operator** - A malicious internal Consul operator with a valid mTLS certificate and ACL token may still be a threat
to your cluster in certain situations, especially in multi-team deployments. They may accidentally or intentionally
abuse access to Consul components; Namespaces and Sentinel policies can help protect against this.
- **Application** - A malicious internal application, such as a compromised third-party dependency with access to a
Consul agent, along with the TLS certificate or ACL token used by the local agent, could effectively do anything the
token permits. Consider enabling HTTPS for the local Consul agent API, enforcing full mutual TLS verification,
segmenting services using namespaces, as well as configuring OS users, groups, and file permissions to build a
defense-in-depth approach.
- **RPC** - Malicious actors with access to a Consul agent RPC endpoint may be able to impersonate Consul server agents
if mTLS is not properly configured to verify the client TLS certificate identity. Consul should also have ACLs enabled
with a default policy explicitly set to deny to require authorization.
- **HTTP** - Malicious actors with access to a Consul agent HTTP(S) endpoint may be able to impersonate the agent's
configured identity, and extract information from Consul when ACLs are disabled.
- **DNS** - Malicious actors with access to a Consul agent DNS endpoint may be able to extract service catalog
information.
- **Gossip** - Malicious actors with access to a Consul agent Serf gossip endpoint may be able to impersonate
agents within a datacenter. Gossip encryption should be enabled, with a regularly rotated gossip key.
- **Proxy (xDS)** - Malicious actors with access to a Consul agent xDS endpoint may be able to extract Envoy service
information. When ACLs and HTTPS are enabled, the gRPC endpoint serving up the xDS service requires (m)TLS and a
valid ACL token.
#### External Threats
- **Agents** - External access to the Consul agent's various network endpoints should be considered, including the
gossip, HTTP, RPC, and gRPC ports. Furthermore, access through other services like SSH or `exec` functionality in
orchestration systems such as Nomad and Kubernetes may expose unencrypted information persisted to disk including
TLS certificates or ACL tokens. Access to the Consul agent directory is explicitly outside the scope of Consul's
threat model and should only be exposed to authenticated and authorized users.
- **Gateways** - Consul supports a variety of [gateways](/docs/connect/gateways) to allow traffic in-and-out of the
service mesh to support a variety of workloads. When using an internet-exposed gateway, you should be sure to harden
your Consul agent and host configurations. In most configurations, ACLs, gossip encryption, and mTLS should be
enforced. If an [escape hatch override](https://www.consul.io/docs/connect/proxies/envoy#escape-hatch-overrides) is
required, the proxy configuration should be audited to ensure security configurations remain intact, and do not
violate Consul's security model.