mirror of https://github.com/status-im/consul.git
411 lines
25 KiB
Plaintext
411 lines
25 KiB
Plaintext
---
|
||
layout: docs
|
||
page_title: Consul Core Security Model
|
||
description: >-
|
||
Security model including requirements, recommendations, and threats for the core Consul product.
|
||
---
|
||
|
||
## Overview
|
||
|
||
Consul enables automation of network configurations, service discovery, and secure network connectivity across any
|
||
cloud or runtime.
|
||
|
||
Consul uses a lightweight gossip and RPC system which provides various essential features. Both of these systems
|
||
provide security mechanisms which should be used to enable confidentiality, integrity and authentication.
|
||
|
||
Using defense in depth is crucial for Consul security, and deployment requirements may differ drastically depending on
|
||
your use case. Some security features for multi-tenant deployments are offered exclusively in the
|
||
[Enterprise](/docs/enterprise) version. This documentation may need to be adapted to your
|
||
environment, but the general mechanisms for a secure Consul deployment revolve around:
|
||
|
||
- **mTLS** - Mutual authentication of both the TLS server and client x509 certificates prevents internal abuse from
|
||
unauthorized access to network components within the cluster.
|
||
|
||
- **ACLs** - Enable role-based access controls for authenticated connections by granting capabilities for an individual
|
||
human, or machine operator identity via an ACL token to authorize actions within a cluster. Optionally, custom
|
||
[authentication methods](/docs/security/acl/auth-methods) can be used to enable trusted external parties to authorize
|
||
ACL token creation.
|
||
|
||
- **Namespaces** <EnterpriseAlert inline /> - Read and write operations can be scoped to a logical namespace to restrict
|
||
access to Consul components within a multi-tenant environment.
|
||
|
||
- **Sentinel Policies** <EnterpriseAlert inline /> - Sentinel policies enable policy-as-code for granular control over
|
||
the built-in key-value store.
|
||
|
||
### Personas
|
||
|
||
It helps to consider the following types of personas when managing the security requirements of a Consul deployment.
|
||
The granularity may change depending on your team's requirements.
|
||
|
||
- **System Administrator** - This is someone who has access to the underlying infrastructure to the Consul cluster.
|
||
Often they have access to SSH or RDP directly into a server within a cluster through a bastion host. Ultimately they
|
||
have read, write and execute permissions for the actual Consul binary. This binary is the same for server and client
|
||
agents using different configuration files. These users potentially have sudo, administrative, or some other
|
||
super-user access to the underlying compute resource. They have access to all persisted data on disk, or in memory.
|
||
This would include ACL tokens, certificates, and other secrets stored on the system. Users like these are essentially
|
||
totally trusted, as they have administrative rights to the underlying operating-system with the ability to configure,
|
||
start, and stop the agent.
|
||
|
||
- **Consul Administrator** - This is someone (probably the same System Administrator) who has access to define the
|
||
Consul agent configurations for servers and clients, and/or have a Consul management ACL token. They also have total
|
||
rights to all of the parts in the Consul system including the ability to manage all services within a cluster.
|
||
|
||
- **Consul Operator** - This is someone who likely has restricted capabilities to use their namespace within a cluster.
|
||
|
||
- **Developer** - This is someone who is responsible for creating, and possibly deploying applications connected, or
|
||
configured with Consul. In some cases they may have no access, or limited capabilities to view Consul information,
|
||
such as through metrics, or logs.
|
||
|
||
- **User** - This is the end user, using applications backed by services managed by Consul. In some cases services may
|
||
be public facing on the internet such as a web server, typically through a load-balancer, or ingress gateway. This is
|
||
someone who should not have any network access to the Consul agent APIs.
|
||
|
||
### Secure Configuration
|
||
|
||
Consul's security model is applicable only if all parts of the system are running with a secure configuration; **Consul
|
||
is not secure-by-default.** Without the following mechanisms enabled in Consul's configuration, it may be possible to
|
||
abuse access to a cluster. Like all security considerations, administrators must determine what is appropriate for their
|
||
environment and adapt these configurations accordingly.
|
||
|
||
#### Requirements
|
||
|
||
- **mTLS** - Mutual authentication of both the TLS server and client x509 certificates prevents internal abuse through
|
||
unauthorized access to Consul agents within the cluster.
|
||
|
||
- [`verify_incoming`](/docs/agent/options#verify_incoming) - By default this is false, and should almost always be set
|
||
to true to require TLS verification for incoming client connections. This applies to both server RPC and to the
|
||
HTTPS API.
|
||
|
||
- [`verify_incoming_https`](/docs/agent/options#verify_incoming_https) - By default this is false, and should be set
|
||
to true to require clients to provide a valid TLS certificate when the Consul HTTPS API is enabled. TLS for the API
|
||
may be not be necessary if it is exclusively served over a loopback interface such as `localhost`.
|
||
|
||
- [`verify_incoming_rpc`](/docs/agent/options#verify_incoming_rpc) - By default this is false, and should almost
|
||
always be set to true to require clients to provide a valid TLS certificate for Consul agent RPCs.
|
||
|
||
- [`verify_outgoing`](/docs/agent/options#verify_outgoing) - By default this is false, and should be set to true to
|
||
require TLS for outgoing connections from server or client agents. Servers that specify `verify_outgoing = true`
|
||
will always talk to other servers over TLS, but they still accept non-TLS connections to allow for a transition of
|
||
all clients to TLS. Currently the only way to enforce that no client can communicate with a server unencrypted is
|
||
to also enable `verify_incoming` which requires client certificates too.
|
||
|
||
- [`enable_agent_tls_for_checks`](/docs/agent/options#enable_agent_tls_for_checks) - By default this is false, and
|
||
should almost always be set to true to require mTLS to set up the client for HTTP or gRPC health checks. This was
|
||
added in Consul 1.0.1.
|
||
|
||
- [`verify_server_hostname`](/docs/agent/options#verify_server_hostname) - By default this is false, and should be
|
||
set to true to require that the TLS certificate presented by the servers matches
|
||
`server.<datacenter>.<domain>` hostname for outgoing TLS connections. The default configuration does not verify the
|
||
hostname of the certificate, only that it is signed by a trusted CA. This setting is critical to prevent a
|
||
compromised client agent from being restarted as a server and having all cluster state including all ACL tokens and
|
||
Connect CA root keys replicated to it. This setting was introduced in 0.5.1. From version 0.5.1 to 1.4.0 we
|
||
documented that `verify_server_hostname` being true implied verify_outgoing however due to a bug this was not the
|
||
case so setting only `verify_server_hostname` results in plaintext communication between client and server. See
|
||
[CVE-2018-19653](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-19653) for more details. This is fixed
|
||
in 1.4.1.
|
||
|
||
- [`auto_encrypt`](/docs/agent/options#auto_encrypt) - Enables automated TLS certificate distribution for client
|
||
agent RPC communication using the Connect CA. Using this configuration a [`ca_file`](/docs/agent/options#ca_file)
|
||
and ACL token would still need to be distributed to client agents.
|
||
|
||
- [`allow_tls`](/docs/agent/options#allow_tls) - By default this is false, and should be set to true on server
|
||
agents to allow certificates to be automatically generated and distributed from the Connect CA to client agents.
|
||
|
||
- [`tls`](/docs/agent/options#tls) - By default this false, and should be set to true on client agents to
|
||
automatically request a client TLS certificate from the server's Connect CA.
|
||
|
||
**Example Server Agent TLS Configuration**
|
||
|
||
```hcl
|
||
verify_incoming = true
|
||
verify_outgoing = true
|
||
verify_server_hostname = true
|
||
|
||
ca_file = "consul-agent-ca.pem"
|
||
cert_file = "dc1-server-consul-0.pem"
|
||
key_file = "dc1-server-consul-0-key.pem"
|
||
|
||
auto_encrypt {
|
||
allow_tls = true
|
||
}
|
||
```
|
||
|
||
**Example Client Agent TLS Configuration**
|
||
|
||
```hcl
|
||
verify_incoming = false
|
||
verify_outgoing = true
|
||
verify_server_hostname = true
|
||
|
||
ca_file = "consul-agent-ca.pem"
|
||
|
||
auto_encrypt {
|
||
tls = true
|
||
}
|
||
```
|
||
|
||
-> The client agent TLS configuration from above sets [`verify_incoming`](/docs/agent/options#verify_incoming) to
|
||
false which assumes all incoming traffic is restricted to `localhost`. The primary benefit for this configuration
|
||
would be to avoid provisioning client TLS certificates (in addition to ACL tokens) for all tools or applications
|
||
using the local Consul agent. In this case ACLs should be enabled to provide authorization and only ACL tokens would
|
||
need to be distributed.
|
||
|
||
- **ACLs** - The access control list (ACL) system provides a security mechanism for Consul administrators to grant
|
||
capabilities tied to an individual human, or machine operator identity. To ultimately secure the ACL system,
|
||
administrators should configure the [`default_policy`](/docs/agent/options#acl_default_policy) to "deny".
|
||
|
||
The [system](/docs/acl/acl-system) is comprised of five major components:
|
||
|
||
- **🗝 Token** - API key associated with policies, roles, or service identities.
|
||
|
||
- **📜 Policy** - Set of rules to grant or deny access to various Consul resources.
|
||
|
||
- **🎭 Role** - Grouping of policies, and service identities.
|
||
|
||
- **👤 Service or Node Identity** - Synthetic policy granting a predefined set of permissions typical for services
|
||
deployed within Consul.
|
||
|
||
- **🏷 Namespace** <EnterpriseAlert inline /> - a named, logical scoping of Consul Enterprise resources, typically to
|
||
enable multi-tenant environments. Consul OSS clusters always operate within the “default” namespace.
|
||
|
||
- **Gossip Encryption** - A shared, base64-encoded 32-byte symmetric key is required to [encrypt Serf gossip
|
||
communication](https://learn.hashicorp.com/tutorials/consul/gossip-encryption-secure) within a cluster using
|
||
AES GCM. The key size determines which AES encryption types to use; 16, 24, or 32 bytes to select AES-128, AES-192,
|
||
or AES-256 respectively. 32-byte keys are ultimately preferable and is the default size generated by the
|
||
[`keygen`](/commands/keygen) command. This key should be
|
||
[regularly rotated](https://support.hashicorp.com/hc/en-us/articles/360044051754-Consul-Gossip-Key-Rotation) using
|
||
the builtin [keyring management](/commands/keyring) features of Consul.
|
||
|
||
Two optional gossip encryption options enable Consul servers without gossip encryption to safely upgrade. After
|
||
upgrading, the verification options should be enabled, or removed to set them to their default state:
|
||
|
||
- [`encrypt_verify_incoming`](/docs/agent/options#encrypt_verify_incoming) - By default this is true to enforce
|
||
encryption on _incoming_ gossip communications.
|
||
|
||
- [`encrypt_verify_outgoing`](/docs/agent/options#encrypt_verify_outgoing) - By default this is true to enforce
|
||
encryption on _outgoing_ gossip communications.
|
||
|
||
- **Namespaces** <EnterpriseAlert inline /> - Read and write operations should be scoped to logical namespaces to
|
||
restrict access to Consul components within a multi-tenant environment. Furthermore, this feature can be used to
|
||
enable a self-service approach to Consul ACL administration for teams within a scoped namespace.
|
||
|
||
- **Sentinel Policies** <EnterpriseAlert inline /> - Sentinel policies allow for granular control over the builtin
|
||
key-value store.
|
||
|
||
- **Ensure Script Checks are Disabled** - Consul’s agent optionally has an HTTP API, which can be exposed beyond
|
||
`localhost`. If this is the case, `enable_script_checks` must be false otherwise, even with ACLs configured, script
|
||
checks present a remote code execution threat. `enable_local_script_checks` provides a secure alternative if the
|
||
HTTP API must be exposed and is available from 1.3.0 on. This feature was also back-ported to patch releases 0.9.4,
|
||
1.1.1, and 1.2.4 as described here. This is not enabled by default.
|
||
|
||
- **Ensure Remote Execution is Disabled** - Consul includes a consul exec feature allowing execution of arbitrary
|
||
commands across the cluster. This is disabled by default since 0.8.0. We recommend leaving it disabled. If enabled,
|
||
extreme care must be taken to ensure correct ACLs restrict access to execute arbitrary code on the cluster.
|
||
|
||
#### Recommendations
|
||
|
||
- **Rotate Credentials** - Using short-lived credentials and rotating them frequently is highly recommended for
|
||
production environments to limit the blast radius from potentially compromised secrets, and enabling basic auditing.
|
||
|
||
- **ACL Tokens** - Consul API’s require an ACL token to authorize actions within a cluster.
|
||
|
||
- **X.509 Certificates** - Rotate certificates used by the Consul agent; e.g. integrate with Vault's PKI secret engine
|
||
to automatically generate and renew dynamic, unique X.509 certificates for each Consul node with a short TTL. Client
|
||
certificates can be automatically rotated by Consul when using `auto_encrypt` such that only server certificates
|
||
would be managed by Vault.
|
||
|
||
- **Gossip Keys** - Rotating the encryption keys used by the internal gossip protocol for Consul agents can be
|
||
regularly rotated using the builtin keyring management features.
|
||
|
||
- **Running without Root** - Consul agents can be run as unprivileged users that only require access to the
|
||
data directory.
|
||
|
||
- **Linux Security Modules** - Use of security modules that can be directly integrated into operating systems such as
|
||
AppArmor, SElinux, and Seccomp on Consul agent hosts.
|
||
|
||
- **Customize TLS Settings** - TLS settings such as the [available cipher suites](/docs/agent/options#tls_cipher_suites),
|
||
should be tuned to fit the needs of your environment.
|
||
|
||
- [`tls_min_version`](/docs/agent/options#tls_min_version) - Used to specify the minimum TLS version to use.
|
||
|
||
- [`tls_cipher_suites`](/docs/agent/options#tls_cipher_suites) - Used to specify which TLS cipher suites are allowed.
|
||
|
||
- [`tls_prefer_server_cipher_suites`](/docs/agent/options#tls_prefer_server_cipher_suites) - Used to specify which TLS
|
||
cipher suites are preferred on the server side.
|
||
|
||
- **Customize HTTP Response Headers** - Additional security headers, such as
|
||
[`X-XSS-Protection`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection), can be
|
||
[configured](https://www.consul.io/docs/agent/options#response_headers) for HTTP API responses.
|
||
|
||
```hcl
|
||
http_config {
|
||
response_headers {
|
||
"X-Frame-Options" = "DENY"
|
||
}
|
||
}
|
||
```
|
||
|
||
- **Customize Default Limits** - Consul has a number of builtin features with default connection limits that should be
|
||
tuned to fit your environment.
|
||
|
||
- [`http_max_conns_per_client`](/docs/agent/options#http_max_conns_per_client) - Used to limit concurrent access from
|
||
a single client to the HTTP(S) endpoint on Consul agents.
|
||
|
||
- [`https_handshake_timeout`](/docs/agent/options#https_handshake_timeout) - Used to timeout TLS connection for the
|
||
HTTP(S) endpoint for Consul agents.
|
||
|
||
- [`rpc_handshake_timeout`](/docs/agent/options#rpc_handshake_timeout) - Used to timeout TLS connections for the RPC
|
||
endpoint for Consul agents.
|
||
|
||
- [`rpc_max_conns_per_client`](/docs/agent/options#rpc_max_conns_per_client) - Used to limit concurrent access from a
|
||
single client to the RPC endpoint on Consul agents.
|
||
|
||
- [`rpc_rate`](/docs/agent/options#rpc_rate) - Disabled by default, this is used to limit (requests/second) for client
|
||
agents making RPC calls to server agents.
|
||
|
||
- [`rpc_max_burst`](/docs/agent/options#rpc_max_burst) - Used as the token bucket size for client agents making RPC
|
||
calls to server agents.
|
||
|
||
- [`kv_max_value_size`](/docs/agent/options#kv_max_value_size) - Used to configure the max number of bytes in a
|
||
key-value API request.
|
||
|
||
- [`txn_max_req_len`](/docs/agent/options#txn_max_req_len) - Used to configure the max number of bytes in a
|
||
transaction API request.
|
||
|
||
- **Secure UI Access** - Access to Consul’s builtin UI can be secured in various ways:
|
||
|
||
- **mTLS** - Enabling the HTTPS with mutual TLS authentication is recommended, but requires extra tooling to terminate
|
||
the mTLS connection, preferably on an operator's local machine using a proxy script.
|
||
|
||
- **TLS** - Enabling the HTTPS is recommended where mTLS may not be required for UI access, such as when ACLs are
|
||
configured with a default deny.
|
||
|
||
- **ACL** - ACLs with a default deny policy enables safer UI access by preventing unauthorized access to sensitive
|
||
components within the cluster.
|
||
|
||
- **Restrict HTTP Writes** - Using the allow_write_http_from configuration option enables agent endpoints restricting
|
||
write capabilities to a list of CIDRs.
|
||
|
||
**Example Agent Configuration**
|
||
|
||
```hcl
|
||
http_config {
|
||
allow_write_http_from = ["127.0.0.0/8"]
|
||
}
|
||
```
|
||
|
||
### Threat Model
|
||
|
||
The following are parts of the core Consul threat model:
|
||
|
||
- **Consul agent-to-agent communication** - Communication between Consul agents should be secure from eavesdropping.
|
||
This requires transport encryption to be enabled on the cluster and covers both TCP and UDP traffic.
|
||
|
||
- **Consul agent-to-CA communication** - Communication between the Consul server and the configured certificate
|
||
authority provider for Connect is always encrypted.
|
||
|
||
- **Tampering of data in transit** - Any tampering should be detectable and cause Consul to avoid processing
|
||
the request.
|
||
|
||
- **Access to data without authentication or authorization** - All requests must be authenticated and authorized. This
|
||
requires that ACLs are enabled on the cluster with a default deny mode.
|
||
|
||
- **State modification or corruption due to malicious messages** - Ill-formatted messages are discarded and
|
||
well-formatted messages require authentication and authorization.
|
||
|
||
- **Non-server members accessing raw data** - All servers must join the cluster (with proper authentication and
|
||
authorization) to begin participating in Raft. Raft data is transmitted over TLS.
|
||
|
||
- **Denial of Service against a node** - DoS attacks against a node should not compromise the security stance of
|
||
the software.
|
||
|
||
- **Connect-based Service-to-Service communication** - Communications between two Connect-enabled services (natively or
|
||
by proxy) should be secure from eavesdropping and provide authentication. This is achieved via mutual TLS.
|
||
|
||
The following are not part of the threat model for server agents:
|
||
|
||
- **Access (read or write) to the Consul data directory** - All Consul servers, including non-leaders, persist the full
|
||
set of Consul state to this directory. The data includes all KV, service registrations, ACL tokens, Connect CA
|
||
configuration, and more. Any read or write to this directory allows an attacker to access and tamper with that data.
|
||
|
||
- **Access (read or write) to the Consul configuration directory** - Consul configuration can enable or disable the ACL
|
||
system, modify data directory paths, and more. Any read or write of this directory allows an attacker to reconfigure
|
||
many aspects of Consul. By disabling the ACL system, this may give an attacker access to all Consul data.
|
||
|
||
- **Memory access to a running Consul server agent** - If an attacker is able to inspect the memory state of a running
|
||
Consul server agent the confidentiality of almost all Consul data may be compromised. If you're using an external
|
||
Connect CA, the root private key material is never available to the Consul process and can be considered safe. Service
|
||
Connect TLS certificates should be considered compromised; they are never persisted by server agents but do exist
|
||
in-memory during at least the duration of a Sign request.
|
||
|
||
The following are not part of the threat model for client agents:
|
||
|
||
- **Access (read or write) to the Consul data directory** - Consul clients will use the data directory to cache local
|
||
state. This includes local services, associated ACL tokens, Connect TLS certificates, and more. Read or write access
|
||
to this directory will allow an attacker to access this data. This data is typically a smaller subset of the full data
|
||
of the cluster.
|
||
- **Access (read or write) to the Consul configuration directory** - Consul client configuration files contain the
|
||
address and port information of services, default ACL tokens for the agent, and more. Access to Consul configuration
|
||
could enable an attacker to change the port of a service to a malicious port, register new services, and more.
|
||
Further, some service definitions have ACL tokens attached that could be used cluster-wide to impersonate that
|
||
service. An attacker cannot change cluster-wide configurations such as disabling the ACL system.
|
||
|
||
- **Memory access to a running Consul client agent** - The blast radius of this is much smaller than a server agent but
|
||
the confidentiality of a subset of data can still be compromised. Particularly, any data requested against the agent's
|
||
API including services, KV, and Connect information may be compromised. If a particular set of data on the server was
|
||
never requested by the agent, it never enters the agent's memory since replication only exists between servers. An
|
||
attacker could also potentially extract ACL tokens used for service registration on this agent, since the tokens must
|
||
be stored in-memory alongside the registered service.
|
||
|
||
- **Network access to a local Connect proxy or service** - Communications between a service and a Connect-aware proxy
|
||
are generally unencrypted and must happen over a trusted network. This is typically a loopback device. This requires
|
||
that other processes on the same machine are trusted, or more complex isolation mechanisms are used such as network
|
||
namespaces. This also requires that external processes cannot communicate to the Connect service or proxy (except on
|
||
the inbound port). Therefore, non-native Connect applications should only bind to non-public addresses.
|
||
|
||
- **Improperly Implemented Connect proxy or service** - A Connect proxy or natively integrated service must correctly
|
||
serve a valid leaf certificate, verify the inbound TLS client certificate, and call the Consul agent-local authorized
|
||
endpoint. If any of this isn't performed correctly, the proxy or service may allow unauthenticated or unauthorized
|
||
connections.
|
||
|
||
#### Internal Threats
|
||
|
||
- **Operator** - A malicious internal Consul operator with a valid mTLS certificate and ACL token may still be a threat
|
||
to your cluster in certain situations, especially in multi-team deployments. They may accidentally or intentionally
|
||
abuse access to Consul components which can help be protected against using Namespace, and Sentinel policies.
|
||
|
||
- **Application** - A malicious internal application, such as a compromised third-party dependency with access to a
|
||
Consul agent, along with the TLS certificate or ACL token used by the local agent, could effectively do anything the
|
||
token permits. Consider enabling HTTPS for the local Consul agent API, enforcing full mutual TLS verification,
|
||
segmenting services using namespaces, as well as configuring OS users, groups, and file permissions to build a defense-in-depth approach.
|
||
|
||
- **RPC** - Malicious actors with access to a Consul agent RPC endpoint may be able to impersonate Consul server agents
|
||
if mTLS is not properly configured to verify the client TLS certificate identity. Consul should also have ACLs enabled
|
||
with a default policy explicitly set to deny to require authorization.
|
||
|
||
- **HTTP** - Malicious actors with access to a Consul agent HTTP(S) endpoint may be able to impersonate the agent’s
|
||
configured identity, and extract information from Consul when ACLs are disabled.
|
||
|
||
- **DNS** - Malicious actors with access to a Consul agent DNS endpoint may be able to extract service catalog
|
||
information. Gossip - Malicious actors with access to a Consul agent Serf gossip endpoint may be able to impersonate
|
||
agents within a datacenter. Gossip encryption should be enabled, with a regularly rotated gossip key.
|
||
|
||
- **Proxy (xDS)** - Malicious actors with access to a Consul agent xDS endpoint may be able to extract Envoy service
|
||
information. When ACLs and HTTPS are enabled, the gRPC endpoint serving up the xDS service requires (m)TLS and a
|
||
valid ACL token.
|
||
|
||
#### External Threats
|
||
|
||
- **Agents** - External access to the Consul agent’s various network endpoints should be considered including the
|
||
gossip, HTTP, RPC, and gRPC ports. Furthermore, access through other services like SSH or `exec` functionality in
|
||
orchestration systems such as Nomad and Kubernetes may expose unencrypted information persisted to disk including
|
||
TLS certificates or ACL tokens. Access to the Consul agent directory is explicitly outside the scope of Consul’s
|
||
threat model and should only be exposed to authenticated and authorized users.
|
||
|
||
- **Gateways** - Consul supports a variety of [gateways](/docs/connect/gateways) to allow traffic in-and-out of the
|
||
service mesh to support a variety of workloads. When using an internet-exposed gateway, you should be sure to harden
|
||
your Consul agent and host configurations. In most configurations, ACLS, gossip encryption, and mTLS should be
|
||
enforced. If an [escape hatch override](https://www.consul.io/docs/connect/proxies/envoy#escape-hatch-overrides) is
|
||
required, the proxy configuration should be audited to ensure security configurations remain intact, and do not
|
||
violate Consul’s security model.
|