---
layout: docs
page_title: Upgrading Specific Versions
sidebar_title: Specific Version Details
description: >-
  Specific versions of Consul may have additional information about the upgrade
  process beyond the standard flow.
---

# Upgrading Specific Versions

The [upgrading page](/docs/upgrading) covers the details of doing a
standard upgrade. However, specific versions of Consul may have more details
provided for their upgrades as a result of new features or changed behavior.
This page is used to document those details separately from the standard
upgrade flow.

## Consul 1.8.0

The [`acl_enforce_version_8`](/docs/agent/options#acl_enforce_version_8)
configuration option has been removed; version 8 ACL support is now on by
default.

## Consul 1.7.0

Consul 1.7.0 contains three major changes that impact upgrades:
[stricter JSON decoding](#stricter-json-decoding), [modified DNS outputs](#dns-ptr-record-output),
and [backward-incompatible Session API changes](#session-api).

### Session API

Consul 1.7.0 introduced a backward-incompatible change to the Session API.
Queries to view or renew sessions from agents on earlier versions will be rejected.
This impacts features and products including Vault, the Enterprise snapshot agent, and locks.

The issue occurs when clients are still running 1.6.4 or earlier but servers have been upgraded to 1.7.0 or 1.7.1.
For this reason, we recommend you upgrade directly to 1.7.2 when it is available, as it will include a fix for this issue.
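
As a pre-upgrade check, the affected combination can be expressed as a small predicate. This is a minimal sketch: the function names are hypothetical, and it assumes plain dotted version strings with no pre-release suffixes.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.6.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def session_api_at_risk(client_version, server_version):
    """Return True for the combination described above:
    clients on 1.6.4 or earlier talking to servers on 1.7.0 or 1.7.1."""
    client = parse_version(client_version)
    server = parse_version(server_version)
    return client <= (1, 6, 4) and server in {(1, 7, 0), (1, 7, 1)}
```

For example, `session_api_at_risk("1.6.2", "1.7.1")` is true, while any pairing with servers on 1.7.2 is not flagged.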

### Stricter JSON Decoding

The HTTP API will now return 400 status codes with a textual error when unknown fields
are present in the payload of a request. Previously, Consul would simply ignore the
unknown fields. You will need to ensure that your API usage only includes supported
fields, which are those documented in the example payloads in the API documentation.
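
One way to catch such payloads before upgrading is to compare them against an allow-list on the client side. The sketch below is illustrative only: `SUPPORTED` is a hypothetical allow-list, not the real field set of any endpoint, and only top-level keys are checked.

```python
def unknown_fields(payload, supported):
    """Return the sorted top-level keys of payload that are not in supported."""
    return sorted(set(payload) - set(supported))


# Hypothetical allow-list; build the real one from the API docs for your endpoint.
SUPPORTED = {"Name", "ID", "Tags", "Port"}

payload = {"Name": "web", "Port": 8080, "Colour": "blue"}
extra = unknown_fields(payload, SUPPORTED)  # keys the 1.7.0 API would reject
```

Nested objects would need the same check applied recursively.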

### DNS PTR Record Output

Consul will now return the canonical service name in response to PTR queries. For OSS users, the
change is that the datacenter will be present where it was not before. For Consul Enterprise
users, both the datacenter and the service's namespace will be present. For example, where a
PTR record would previously have contained `web.service.consul`, it will now be `web.service.dc1.consul`
in OSS or `web.service.ns1.dc1.consul` for Enterprise.
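
Tools that parse PTR answers may need to expect the longer names. The helper below is a hypothetical sketch that assembles the name from its parts in the order shown in the examples above, assuming the default `consul` domain.

```python
def ptr_service_name(service, datacenter, namespace=None, domain="consul"):
    """Build the canonical name in the 1.7.0 format shown above.

    Pass a namespace only for Consul Enterprise; OSS names omit it.
    """
    parts = [service, "service"]
    if namespace is not None:
        parts.append(namespace)
    parts.extend([datacenter, domain])
    return ".".join(parts)
```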

### Telemetry: semantics of `consul.rpc.query` changed, see `consul.rpc.queries_blocking`

Consul has changed the semantics of query counts in its [telemetry](/docs/agent/telemetry#metrics-reference).
`consul.rpc.query` now only increments on the _start_ of a query (blocking or non-blocking), whereas before it would
measure when blocking queries polled for more data. The `consul.rpc.queries_blocking` gauge has been added
to more precisely capture the view of _active_ blocking queries.

### Vault: default `http_max_conns_per_client` too low to run Vault properly

Consul 1.7.0 introduced [limiting of connections per client](/docs/agent/options#http_max_conns_per_client). The default value
was 100, but Vault could use up to 128, which caused problems. If you want to use Vault with Consul 1.7.0, you should change the value to 200.
Starting with Consul 1.7.1 this is the new default.
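
The override can be written as a JSON config fragment. The sketch below generates one; it assumes the option is set inside the `limits` block of the agent configuration, so check the linked option reference for your version before using it.

```python
import json


def limits_for_vault(max_conns=200):
    """Build a config fragment that raises the per-client HTTP connection limit.

    Assumption: the option is nested under a `limits` block.
    """
    return {"limits": {"http_max_conns_per_client": max_conns}}


# Print the fragment so it can be saved as a .json config file.
print(json.dumps(limits_for_vault(), indent=2))
```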

## Consul 1.6.3

### Vault: default `http_max_conns_per_client` too low to run Vault properly

Consul 1.6.3 introduced [limiting of connections per client](/docs/agent/options#http_max_conns_per_client). The default value
was 100, but Vault could use up to 128, which caused problems. If you want to use Vault with Consul 1.6.3, you should change the value to 200.
Starting with Consul 1.6.4 this is the new default.

## Consul 1.6.0

#### Removal of Deprecated Features

Managed proxies (which have been [deprecated](/docs/connect/proxies/managed-deprecated)
since Consul 1.3.0) have now been [removed](/docs/connect/proxies). Before
upgrading, you will need to migrate any managed proxy usage to [sidecar service
registrations](/docs/connect/registration/sidecar-service).

## Consul 1.4.0

There are two major features in Consul 1.4.0 that may impact upgrades: a [new
ACL system](#acl-upgrade) and [multi-datacenter support for
Connect](#connect-multi-datacenter) in the Enterprise version.

### ACL Upgrade

Consul 1.4.0 includes a [new ACL
system](https://learn.hashicorp.com/tutorials/consul/access-control-setup-production)
that is designed to have a smooth upgrade path but requires care to upgrade
components in the right order.

**Note:** As with most major version upgrades, you cannot downgrade once the
upgrade to 1.4.0 is complete as it adds new state to the raft store. As always
it is _strongly_ recommended that you test the upgrade first outside of
production and ensure you take backup snapshots of all datacenters before
upgrading.

#### Primary Datacenter

The "ACL datacenter" in 1.3.x and earlier is now referred to as the "Primary
datacenter". All configuration is backwards compatible and shouldn't need to
change prior to upgrade, although it's strongly recommended to migrate ACL
configuration to the new syntax soon after upgrade. This includes moving to
`primary_datacenter` rather than `acl_datacenter` and moving `acl_*` options to
the new [ACL block](/docs/agent/options#acl).

Datacenters can be upgraded in any order, although secondaries will remain in
[Legacy ACL mode](#legacy-acl-mode) until the primary datacenter is fully
upgraded.

Each datacenter should follow the [standard rolling upgrade
procedure](/docs/upgrading#standard-upgrades).

#### Legacy ACL Mode

When a 1.4.0 server first starts, it runs in "Legacy ACL mode". In this mode,
bootstrap requests and new ACL APIs will not be functional yet and will return
an error. The server advertises its ability to support 1.4.0 ACLs via gossip
and waits.

In the primary datacenter, the servers all wait in legacy ACL mode until they
see every server in the primary datacenter advertise 1.4.0 ACL support. Once
this happens, the leader will complete the transition out of "legacy ACL mode"
and write this into the state so future restarts don't need to go through the
same transition.

In a secondary datacenter, the same process happens except that servers
_additionally_ wait for all servers in the primary datacenter, making it safe to
upgrade datacenters in any order.

It should be noted that even if you are not upgrading, starting a brand new
1.4.0 cluster will transition through legacy ACL mode, so you may be unable to
bootstrap ACLs until all the expected servers are up and healthy.

#### Legacy Token Accessor Migration

As soon as all servers in the primary datacenter have been upgraded to 1.4.0,
the leader will begin the process of creating new accessor IDs for all existing
ACL tokens.

This process completes in the background and is rate limited to ensure it
doesn't overload the leader. It completes upgrades in batches of 128 tokens and
will not upgrade more than one batch per second so on a cluster with 10,000
tokens, this may take several minutes.
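
The batch size and rate limit give a lower bound on how long the migration takes. The sketch below works through that arithmetic. The function name is hypothetical, and real migrations may take longer than this bound because the limit is a ceiling on speed, not a guarantee.

```python
import math

BATCH_SIZE = 128  # tokens upgraded per batch, from the text above


def min_migration_seconds(token_count):
    """Lower bound in seconds, at no more than one batch per second."""
    return math.ceil(token_count / BATCH_SIZE)
```

For 10,000 tokens this works out to 79 batches, so the migration cannot finish in under 79 seconds.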

While this is happening both old and new ACLs will work correctly with the
caveat that new ACL [Token APIs](/api/acl/tokens) may not return an
accessor ID for legacy tokens that are not yet migrated.

#### Migrating Existing ACLs

New ACL policies have slightly different syntax designed to fix some
shortcomings in old ACL syntax. During and after the upgrade process, any old
ACL tokens will continue to work and grant exactly the same level of access.

After upgrade, it is still possible to create "legacy" tokens using the existing
API so existing integrations that create tokens (e.g. Vault) will continue to
work. The "legacy" tokens generated though will not be able to take advantage of
new policy features. It's recommended that you complete migration of all tokens
as soon as possible after upgrade, as well as updating any integrations to work
with the new ACL [Token](/api/acl/tokens) and
[Policy](/api/acl/policies) APIs.

More complete details on how to upgrade "legacy" tokens are available [here](/docs/acl/acl-migrate-tokens).

### Connect Multi-datacenter

This only applies to users upgrading from an older version of Consul Enterprise to Consul Enterprise 1.4.0 (all license types).

In addition, this upgrade will only affect clusters where [Connect is enabled](/docs/connect/configuration) on your servers before the migration.

Connect multi-datacenter uses the same primary/secondary approach as ACLs and
will use the same [primary_datacenter](#primary-datacenter). When a secondary
datacenter server restarts with 1.4.0 it will detect it is not the primary and
begin an automatic bootstrap of multi-datacenter CA federation.

Datacenters can be upgraded in either order; secondary datacenters will not
switch into multi-datacenter mode until all servers in both the secondary and
primary datacenter are detected to be running at least Consul 1.4.0. Secondary
datacenters monitor this periodically (every few minutes) and will
automatically upgrade Connect to use a federated Certificate Authority when
they do.

In general, migrating a Consul cluster from OSS to Enterprise will update the
CA to be federated automatically and without impact on Connect traffic. When
upgrading from Consul Enterprise 1.3.x to Consul Enterprise 1.4.0, the CA
upgrade is seamless; however, depending on the size of the cluster, _new_
connection attempts in the secondary datacenter might fail for a short window
(typically seconds) while the update is propagated. This is due to the 1.3.x Beta
authorization endpoint validating the originating cluster in a way that was not
fully forwards compatible with migrating between cluster trust domains. That
issue is fixed in 1.4.0 as part of General Availability.

Once migrated (typically within a few seconds), Connect will use the primary
datacenter's Certificate Authority as the root of trust for all other
datacenters. CA migration or root key changes in the primary will now rotate
automatically and without loss of connectivity throughout all datacenters and
workloads.

For more information see [Connect
Multi-datacenter](/docs/enterprise/connect-multi-datacenter).

## Consul 1.3.0

This version added support for multiple tag filters in service discovery
queries; however, it introduced a subtle bug where API calls to
`/catalog/service/:name?tag=<tag>` would ignore the tag filter _only during the
upgrade_. It only occurs when clients are still running 1.2.3 or earlier but
servers have been upgraded. The `/health/service/:name?tag=<tag>` endpoint and
DNS interface were _not_ affected.

For this reason, we recommend you upgrade directly to 1.3.1, which includes only
a fix for this issue.

## Consul 1.1.0

#### Removal of Deprecated Features

The following previously deprecated fields and config options have been removed:

- `CheckID` has been removed from config file check definitions (use `id` instead).
- `script` has been removed from config file check definitions (use `args` instead).
- `enableTagOverride` is no longer valid in service definitions (use `enable_tag_override` instead).
- The [deprecated set of metric names](/docs/upgrade-specific#metric-names-updated) (beginning with `consul.consul.`) has been removed
  along with the `enable_deprecated_names` option from the metrics configuration.
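
Dashboards and alerts that still query the removed metric names need to be pointed at the current ones. The sketch below shows the renaming rule for a single name, assuming the only difference is the duplicated `consul.` prefix. The function name is hypothetical.

```python
OLD_PREFIX = "consul.consul."
NEW_PREFIX = "consul."


def modern_metric_name(name):
    """Rewrite a deprecated 'consul.consul.*' name; leave other names untouched."""
    if name.startswith(OLD_PREFIX):
        return NEW_PREFIX + name[len(OLD_PREFIX):]
    return name
```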

#### New defaults for Raft Snapshot Creation

Consul 1.0.1 (and earlier versions of Consul) checked for raft snapshots every
5 seconds, and created new snapshots for every 8192 writes. These defaults cause
constant disk IO in large busy clusters. Consul 1.1.0 increases these to larger values,
and makes them tunable via the [raft_snapshot_interval](/docs/agent/options#_raft_snapshot_interval) and
[raft_snapshot_threshold](/docs/agent/options#_raft_snapshot_threshold) parameters. We recommend
keeping the new defaults. However, operators can go back to the old defaults by changing their
config if they prefer more frequent snapshots. See the documentation for [raft_snapshot_interval](/docs/agent/options#_raft_snapshot_interval)
and [raft_snapshot_threshold](/docs/agent/options#_raft_snapshot_threshold) to understand the trade-offs
when tuning these.
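
For operators who do want the old behavior back, the sketch below renders the pre-1.1.0 values as a JSON config fragment. It assumes the interval is written as a duration string (`"5s"`) and the threshold as a plain integer; confirm both against the linked option documentation.

```python
import json

# Values quoted in the text above for Consul 1.0.1 and earlier.
LEGACY_SNAPSHOT_SETTINGS = {
    "raft_snapshot_interval": "5s",   # how often to check whether a snapshot is needed
    "raft_snapshot_threshold": 8192,  # writes between snapshots
}


def render_config(settings):
    """Serialize settings as JSON suitable for a .json config file."""
    return json.dumps(settings, indent=2, sort_keys=True)
```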

## Consul 1.0.7

When requesting a specific service (`/v1/health/service/:service` or
`/v1/catalog/service/:service` endpoints), the `X-Consul-Index` returned is now the
index at which that _specific service_ was last modified. In version 1.0.6 and
earlier the `X-Consul-Index` returned was the index at which _any_ service was
last modified. See [GH-3890](https://github.com/hashicorp/consul/issues/3890)
for more details.

During upgrades from 1.0.6 or lower to 1.0.7 or higher, watchers are likely to
see `X-Consul-Index` for these endpoints decrease between blocking calls.

Consul’s watch feature and `consul-template` should gracefully handle this case.
Other tools relying on blocking service or health queries are also likely to
work; some may require a restart. It is possible external tools could break and
either stop working or continually re-request data without blocking if they
have assumed indexes can never decrease or be reset and/or persist index
values. Please test any blocking query integrations in a controlled environment
before proceeding.
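
A client that tolerates a decreasing index typically resets its wait index instead of reusing a stale, larger value. The sketch below shows that rule on its own, with no HTTP involved. The function name is hypothetical, and it is one reasonable approach, not a description of how Consul's own watch code works.

```python
def next_wait_index(previous_index, returned_index):
    """Choose the index to send on the next blocking call.

    If the returned X-Consul-Index went backwards, start again from 0 so the
    next request returns immediately with fresh data.
    """
    if returned_index < previous_index:
        return 0
    return returned_index
```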

## Consul 1.0.1

#### Carefully Check and Remove Stale Servers During Rolling Upgrades

Consul 1.0 (and earlier versions of Consul when running with [Raft protocol
3](/docs/agent/options#_raft_protocol)) had an issue where performing
rolling updates of Consul servers could result in an outage from old servers
remaining in the cluster.
[Autopilot](https://learn.hashicorp.com/tutorials/consul/autopilot-datacenter-operations)
would normally remove old servers when new ones come online, but it was also
waiting to promote servers to voters in pairs to maintain an odd quorum size.
The pairwise promotion feature was removed so that servers become voters as
soon as they are stable, allowing Autopilot to remove old servers in a safer
way.

When upgrading from Consul 1.0, you may need to manually
[force-leave](/docs/commands/force-leave) old servers as part of a rolling
update to Consul 1.0.1.

## Consul 1.0

Consul 1.0 has several important breaking changes that are documented here.
Please be sure to read over all the details here before upgrading.

#### Raft Protocol Now Defaults to 3

The [`-raft-protocol`](/docs/agent/options#_raft_protocol) default has
been changed from 2 to 3, enabling all
[Autopilot](https://learn.hashicorp.com/tutorials/consul/autopilot-datacenter-operations)
features by default.

Raft protocol version 3 requires Consul running 0.8.0 or newer on all servers
in order to work, so if you are upgrading with older servers in a cluster then
you will need to set this back to 2 in order to upgrade. See [Raft Protocol
Version
Compatibility](/docs/upgrade-specific#raft-protocol-version-compatibility)
for more details. Also the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. Review [Manual Recovery
Using
peers.json](https://learn.hashicorp.com/tutorials/consul/recovery-outage#manual-recovery-using-peers-json)
for a description of the required format.

Please note that the Raft protocol is different from Consul's internal protocol
as described on the [Protocol Compatibility Promise](/docs/compatibility)
page, and as is shown in commands like `consul members` and `consul version`.
To see the version of the Raft protocol in use on each server, use the `consul operator raft list-peers` command.

The easiest way to upgrade servers is to have each server leave the cluster,
upgrade its Consul version, and then add it back. Make sure the new server
joins successfully and that the cluster is stable before rolling the upgrade
forward to the next server. It's also possible to stand up a new set of
servers, and then slowly stand down each of the older servers in a similar
fashion.

When using Raft protocol version 3, servers are identified by their
[`-node-id`](/docs/agent/options#_node_id) instead of their IP address
when Consul makes changes to its internal Raft quorum configuration. This means
that once a cluster has been upgraded with servers all running Raft protocol
version 3, it will no longer allow servers running any older Raft protocol
versions to be added. If running a single Consul server, restarting it in-place
will result in that server not being able to elect itself as a leader. To avoid
this, either set the Raft protocol back to 2, or use [Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/consul/recovery-outage#manual-recovery-using-peers-json)
to map the server to its node ID in the Raft quorum configuration.

#### Config Files Require an Extension

As part of supporting the [HCL](https://github.com/hashicorp/hcl#syntax) format
for Consul's config files, an `.hcl` or `.json` extension is required for all
config files loaded by Consul, even when using the
[`-config-file`](/docs/agent/options#_config_file) argument to specify a
file directly.
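
A quick pre-upgrade scan can flag config files that lack one of the two extensions. The sketch below checks a single path. The function name is hypothetical, and it assumes the match is case-sensitive, which the text above does not state.

```python
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".hcl", ".json"}


def has_supported_extension(path):
    """Return True if the file name ends in .hcl or .json."""
    return PurePosixPath(path).suffix in ALLOWED_EXTENSIONS
```

Running this over every file passed via `-config-file` or found in a config directory gives the list of files to rename before upgrading.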

#### Service Definition Parameter Case Changed

All config file formats now require snake_case fields, so all CamelCased parameter
names should be changed before upgrading.
See the [Service Definition Parameter Case](/docs/agent/services#service-definition-parameter-case) documentation for details.

#### Deprecated Options Have Been Removed

All of Consul's previously deprecated command line flags and config options
have been removed, so these will need to be mapped to their equivalents before
upgrading. Here's the complete list of removed options and their equivalents:

| Removed Option | Equivalent |
| -------------- | ---------- |
| `-dc` | [`-datacenter`](/docs/agent/options#_datacenter) |
| `-retry-join-azure-tag-name` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-azure-tag-value` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-ec2-region` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-ec2-tag-key` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-ec2-tag-value` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-gce-credentials-file` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-gce-project-name` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-gce-tag-name` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `-retry-join-gce-zone-pattern` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `addresses.rpc` | None, the RPC server for CLI commands is no longer supported. |
| `advertise_addrs` | [`ports`](/docs/agent/options#ports) with [`advertise_addr`](/docs/agent/options#advertise_addr) and/or [`advertise_addr_wan`](/docs/agent/options#advertise_addr_wan) |
| `dogstatsd_addr` | [`telemetry.dogstatsd_addr`](/docs/agent/options#telemetry-dogstatsd_addr) |
| `dogstatsd_tags` | [`telemetry.dogstatsd_tags`](/docs/agent/options#telemetry-dogstatsd_tags) |
| `http_api_response_headers` | [`http_config.response_headers`](/docs/agent/options#response_headers) |
| `ports.rpc` | None, the RPC server for CLI commands is no longer supported. |
| `recursor` | [`recursors`](https://github.com/hashicorp/consul/blob/master/website/pages/docs/agent/options.mdx#recursors) |
| `retry_join_azure` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `retry_join_ec2` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `retry_join_gce` | [`-retry-join`](/docs/agent/options#_retry_join) |
| `statsd_addr` | [`telemetry.statsd_address`](https://github.com/hashicorp/consul/blob/master/website/pages/docs/agent/options.mdx#telemetry-statsd_address) |
| `statsite_addr` | [`telemetry.statsite_address`](https://github.com/hashicorp/consul/blob/master/website/pages/docs/agent/options.mdx#telemetry-statsite_address) |
| `statsite_prefix` | [`telemetry.metrics_prefix`](/docs/agent/options#telemetry-metrics_prefix) |
| `telemetry.statsite_prefix` | [`telemetry.metrics_prefix`](/docs/agent/options#telemetry-metrics_prefix) |
| (service definitions) `serviceid` | [`service_id`](/docs/agent/services) |
| (service definitions) `dockercontainerid` | [`docker_container_id`](/docs/agent/services) |
| (service definitions) `tlsskipverify` | [`tls_skip_verify`](/docs/agent/services) |
| (service definitions) `deregistercriticalserviceafter` | [`deregister_critical_service_after`](/docs/agent/services) |

#### `statsite_prefix` Renamed to `metrics_prefix`

Since the `statsite_prefix` configuration option applied to all telemetry
providers, `statsite_prefix` was renamed to
[`metrics_prefix`](/docs/agent/options#telemetry-metrics_prefix).
Configuration files will need to be updated when upgrading to this version of
Consul.

#### `advertise_addrs` Removed

This configuration option was removed since it was redundant with
`advertise_addr` and `advertise_addr_wan` in combination with `ports` and also
wrongly stated that you could configure both host and port.

#### Escaping Behavior Changed for go-discover Configs

The format for [`-retry-join`](/docs/agent/options#retry-join) and
[`-retry-join-wan`](/docs/agent/options#retry-join-wan) values that use
[go-discover](https://github.com/hashicorp/go-discover) cloud auto joining has
changed. Values in `key=val` sequences must no longer be URL encoded and can be
provided as literals as long as they do not contain spaces, backslashes `\` or
double quotes `"`. If values contain these characters then use double quotes as
in `"some key"="some value"`. Special characters within a double quoted string
can be escaped with a backslash `\`.
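
The quoting rule can be applied mechanically when generating `-retry-join` values from a script. The sketch below is a hypothetical helper, not part of go-discover. It assumes the characters that need a backslash inside a quoted string are the backslash and the double quote.

```python
NEEDS_QUOTING = (" ", "\\", '"')


def quote_value(value):
    """Return value as a literal, or double-quoted with escapes when required."""
    if not any(ch in value for ch in NEEDS_QUOTING):
        return value
    # Escape backslashes first so the quote escapes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_retry_join(pairs):
    """Join (key, value) pairs into a space-separated key=val sequence."""
    return " ".join(f"{quote_value(k)}={quote_value(v)}" for k, v in pairs)
```

For example, `build_retry_join([("provider", "aws"), ("tag_key", "consul role")])` quotes only the value that contains a space.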

#### HTTP Verbs are Enforced in Many HTTP APIs

Many endpoints in the HTTP API that previously took any HTTP verb now check for
specific HTTP verbs and enforce them. This may break clients relying on the old
behavior. Here's the complete list of updated endpoints and required HTTP
verbs:

| Endpoint                        | Required HTTP Verb |
| ------------------------------- | ------------------ |
| /v1/acl/info                    | GET                |
| /v1/acl/list                    | GET                |
| /v1/acl/replication             | GET                |
| /v1/agent/check/deregister      | PUT                |
| /v1/agent/check/fail            | PUT                |
| /v1/agent/check/pass            | PUT                |
| /v1/agent/check/register        | PUT                |
| /v1/agent/check/warn            | PUT                |
| /v1/agent/checks                | GET                |
| /v1/agent/force-leave           | PUT                |
| /v1/agent/join                  | PUT                |
| /v1/agent/members               | GET                |
| /v1/agent/metrics               | GET                |
| /v1/agent/self                  | GET                |
| /v1/agent/service/register      | PUT                |
| /v1/agent/service/deregister    | PUT                |
| /v1/agent/services              | GET                |
| /v1/catalog/datacenters         | GET                |
| /v1/catalog/deregister          | PUT                |
| /v1/catalog/node                | GET                |
| /v1/catalog/nodes               | GET                |
| /v1/catalog/register            | PUT                |
| /v1/catalog/service             | GET                |
| /v1/catalog/services            | GET                |
| /v1/coordinate/datacenters      | GET                |
| /v1/coordinate/nodes            | GET                |
| /v1/health/checks               | GET                |
| /v1/health/node                 | GET                |
| /v1/health/service              | GET                |
| /v1/health/state                | GET                |
| /v1/internal/ui/node            | GET                |
| /v1/internal/ui/nodes           | GET                |
| /v1/internal/ui/services        | GET                |
| /v1/session/info                | GET                |
| /v1/session/list                | GET                |
| /v1/session/node                | GET                |
| /v1/status/leader               | GET                |
| /v1/status/peers                | GET                |
| /v1/operator/area/:uuid/members | GET                |
| /v1/operator/area/:uuid/join    | PUT                |
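
Client code can be audited against the table with a simple lookup. The sketch below carries only a few rows as an excerpt and returns `None` for endpoints it does not know about. The names are hypothetical, and a real audit would load the full table.

```python
# Excerpt of the table above; extend with the remaining rows as needed.
REQUIRED_VERBS = {
    "/v1/agent/check/register": "PUT",
    "/v1/agent/join": "PUT",
    "/v1/agent/members": "GET",
    "/v1/catalog/register": "PUT",
    "/v1/catalog/services": "GET",
    "/v1/status/leader": "GET",
}


def verb_is_allowed(endpoint, verb):
    """Return True/False for listed endpoints, or None if the endpoint is not listed."""
    required = REQUIRED_VERBS.get(endpoint)
    if required is None:
        return None
    return verb.upper() == required
```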

#### Unauthorized KV Requests Return 403

When ACLs are enabled, reading a key with an unauthorized token returns a 403.
This previously returned a 404 response.
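
Clients that treated every non-200 KV read as "key not found" may want to separate the two cases after upgrading. The sketch below shows one way to do that. The function and its labels are hypothetical, not part of any Consul client library.

```python
def classify_kv_read(status_code):
    """Map the HTTP status of a KV read to a coarse outcome label."""
    if status_code == 200:
        return "found"
    if status_code == 403:
        return "forbidden"  # token lacks permission (previously looked like a 404)
    if status_code == 404:
        return "missing"
    return "unexpected"
```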

#### Config Section of Agent Self Endpoint has Changed

The /v1/agent/self endpoint's `Config` section has often been in flux as it was
directly returning one of Consul's internal data structures. This configuration
structure has been moved under `DebugConfig`, and is documented as being for
debugging use and subject to change. A small set of elements of `Config` has been
maintained and documented. See the [Read
Configuration](/api/agent#read-configuration) endpoint documentation for
details.
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
|
|
|
|
#### Deprecated `configtest` Command Removed
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
The `configtest` command was deprecated and has been superseded by the
|
|
|
|
|
`validate` command.
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
|
|
|
|
#### Undocumented Flags in `validate` Command Removed
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
The `validate` command supported the `-config-file` and `-config-dir` command
|
|
|
|
|
line flags but did not document them. This support has been removed since the
|
|
|
|
|
flags are not required.
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
|
|
|
|
#### Metric Names Updated
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Metric names no longer start with `consul.consul`. To help with transitioning
|
|
|
|
|
dashboards and other metric consumers, the field `enable_deprecated_names` has
|
|
|
|
|
been added to the telemetry section of the config, which will enable metrics
|
|
|
|
|
with the old naming scheme to be sent alongside the new ones. The following
|
|
|
|
|
prefixes were affected:
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
| Prefix                       |
| ---------------------------- |
| consul.consul.acl            |
| consul.consul.autopilot      |
| consul.consul.catalog        |
| consul.consul.fsm            |
| consul.consul.health         |
| consul.consul.http           |
| consul.consul.kvs            |
| consul.consul.leader         |
| consul.consul.prepared-query |
| consul.consul.rpc            |
| consul.consul.session        |
| consul.consul.session_ttl    |
| consul.consul.txn            |
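
As a minimal sketch of the transition setting described above, the following agent configuration fragment emits the old metric names alongside the new ones. It assumes a JSON configuration file and shows only the `telemetry` section; merge it with any telemetry settings you already have.

```json
{
  "telemetry": {
    "enable_deprecated_names": true
  }
}
```

The field is a transition aid, so it can be dropped once dashboards and other metric consumers have moved to the new names.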
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
|
|
|
|
#### Checks Validated On Agent Startup
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Consul agents now validate health check definitions in their configuration and
|
|
|
|
|
will fail at startup if any checks are invalid. In previous versions of Consul,
|
|
|
|
|
invalid health checks would get skipped.
|
2017-10-13 16:46:36 -07:00
|
|
|
|
|
2017-07-17 14:11:08 -07:00
|
|
|
|
## Consul 0.9.0
|
|
|
|
|
|
2017-07-20 14:48:45 -07:00
|
|
|
|
#### Script Checks Are Now Opt-In
|
2017-07-17 14:11:08 -07:00
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
A new [`enable_script_checks`](/docs/agent/options#_enable_script_checks)
configuration option was added and defaults to `false`. To allow an agent to
run health checks that execute scripts, you must set this option to `true`.
This provides a safer out-of-the-box configuration for Consul, where operators
must opt in to allow script-based health checks.
|
2017-07-17 14:11:08 -07:00
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
If your cluster uses script health checks, be sure to set this to `true`
as part of upgrading agents. If this is set to `true`, you should also enable
[ACLs](https://learn.hashicorp.com/tutorials/consul/access-control-setup-production)
to provide control over which users are allowed to register health checks that
could potentially execute scripts on the agent machines.
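
For agents that rely on script checks, the opt-in might look like the fragment below. This is a sketch that assumes a JSON configuration file and shows only the one option named above; the rest of the agent configuration is omitted.

```json
{
  "enable_script_checks": true
}
```

Agents that do not run script-based checks can leave the option unset and keep the safer default.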
|
2017-07-18 07:11:59 -07:00
|
|
|
|
|
|
|
|
|
#### Web UI Is No Longer Released Separately
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Consul releases will no longer include a `web_ui.zip` file with the compiled
web assets. These have been built into the Consul binary since the 0.7.x
series and can be enabled with the [`-ui`](/docs/agent/options#_ui)
configuration option. These built-in web assets have always been identical to
the contents of the `web_ui.zip` file for each release. The
[`-ui-dir`](/docs/agent/options#_ui_dir) option is still available for
hosting customized versions of the web assets, but the vast majority of Consul
users can just use the built-in web assets.
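
If the agent is configured through a file instead of command-line flags, enabling the built-in assets could be sketched as follows. This assumes the `ui` configuration key is the file equivalent of the `-ui` flag; check the agent options documentation for your version before relying on it.

```json
{
  "ui": true
}
```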
|
2017-07-17 14:11:08 -07:00
|
|
|
|
|
2017-02-24 18:10:46 -08:00
|
|
|
|
## Consul 0.8.0
|
|
|
|
|
|
2017-04-11 10:50:56 -07:00
|
|
|
|
#### Upgrade Current Cluster Leader Last
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
We identified a potential issue with Consul 0.8 that requires the current
|
|
|
|
|
cluster leader to be upgraded last when updating multiple servers. Please see
|
2017-04-11 10:50:56 -07:00
|
|
|
|
[this issue](https://github.com/hashicorp/consul/issues/2889) for more details.
|
|
|
|
|
|
2017-02-24 18:10:46 -08:00
|
|
|
|
#### Command-Line Interface RPC Deprecation
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
The RPC client interface has been removed. All CLI commands that used RPC and
|
|
|
|
|
the `-rpc-addr` flag to communicate with Consul have been converted to use the
|
|
|
|
|
HTTP API and the appropriate flags for it, and the `rpc` field has been removed
|
|
|
|
|
from the port and address binding configs. You will need to remove these fields
|
|
|
|
|
from your config files and update any scripts that passed a custom `-rpc-addr`
|
|
|
|
|
to the following commands:
|
2017-02-24 18:10:46 -08:00
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- `force-leave`
- `info`
- `join`
- `keyring`
- `leave`
- `members`
- `monitor`
- `reload`
|
2017-02-24 18:10:46 -08:00
|
|
|
|
|
2017-03-24 17:45:24 -07:00
|
|
|
|
#### Version 8 ACLs Are Now Opt-Out
|
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
The [`acl_enforce_version_8`](/docs/agent/options#acl_enforce_version_8)
|
2019-05-15 10:49:41 -05:00
|
|
|
|
configuration now defaults to `true` to enable full version 8 ACL support by
|
|
|
|
|
default. If you are upgrading an existing cluster with ACLs enabled, you will
|
|
|
|
|
need to set this to `false` during the upgrade on **both Consul agents and
|
|
|
|
|
Consul servers**. Version 8 ACLs were also changed so that
|
2020-04-09 19:46:54 -04:00
|
|
|
|
[`acl_datacenter`](/docs/agent/options#acl_datacenter) must be set on
|
2019-05-15 10:49:41 -05:00
|
|
|
|
agents in order to enable the agent-side enforcement of ACLs. This makes for a
|
|
|
|
|
smoother experience in clusters where ACLs aren't enabled at all, but where the
|
|
|
|
|
agents would have to wait to contact a Consul server before learning that.
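
For an existing cluster with ACLs enabled, the temporary override described above might look like this fragment, applied to both Consul agents and Consul servers during the upgrade. It is a sketch that assumes a JSON configuration file and shows only the one key.

```json
{
  "acl_enforce_version_8": false
}
```

Once policies have been updated for version 8 ACLs, the override can be removed so that the new default of `true` takes effect.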
|
2017-03-24 17:45:24 -07:00
|
|
|
|
|
2017-03-30 09:25:12 -07:00
|
|
|
|
#### Remote Exec Is Now Opt-In
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
The default for
[`disable_remote_exec`](/docs/agent/options#disable_remote_exec) was
changed to `true`, so operators now need to opt in to having agents support
running commands remotely via [`consul exec`](/docs/commands/exec).
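
Operators who still want `consul exec` to work against an agent could opt in with a fragment like the one below. This is a sketch assuming a JSON configuration file; only the option discussed above is shown.

```json
{
  "disable_remote_exec": false
}
```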
|
2017-03-30 09:25:12 -07:00
|
|
|
|
|
|
|
|
|
#### Raft Protocol Version Compatibility
|
2017-03-10 14:55:18 -08:00
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
When upgrading to Consul 0.8.0 from a version lower than 0.7.0, users will need
|
2020-04-09 19:46:54 -04:00
|
|
|
|
to set the [`-raft-protocol`](/docs/agent/options#_raft_protocol) option
|
2019-05-15 10:49:41 -05:00
|
|
|
|
to 1 in order to maintain backwards compatibility with the old servers during
|
2020-04-06 16:27:35 -04:00
|
|
|
|
the upgrade. After the servers have been migrated to version 0.8.0,
|
2019-05-15 10:49:41 -05:00
|
|
|
|
`-raft-protocol` can be moved up to 2 and the servers restarted to match the
|
|
|
|
|
default.
|
2017-03-10 14:55:18 -08:00
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
The Raft protocol must be stepped up in this way; only adjacent version numbers
|
|
|
|
|
are compatible (for example, version 1 cannot talk to version 3). Here is a
|
|
|
|
|
table of the Raft Protocol versions supported by each Consul version:
|
2017-03-10 14:55:18 -08:00
|
|
|
|
|
2020-04-07 19:56:08 -04:00
|
|
|
|
| Version         | Supported Raft Protocols |
| --------------- | ------------------------ |
| 0.6 and earlier | 0                        |
| 0.7             | 1                        |
| 0.8             | 1, 2, 3                  |
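
The first step of the step-up described above can be sketched as a server configuration fragment. It assumes the `raft_protocol` configuration key as the file equivalent of the `-raft-protocol` flag, and pins the protocol to 1 while old and new servers coexist.

```json
{
  "raft_protocol": 1
}
```

After every server is running 0.8.0, the value would be raised to 2 (or the key removed to pick up the default) and the servers restarted one at a time.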
|
2017-03-10 14:55:18 -08:00
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
In order to enable all
|
2020-09-10 13:32:06 -04:00
|
|
|
|
[Autopilot](https://learn.hashicorp.com/tutorials/consul/autopilot-datacenter-operations)
|
|
|
|
|
features, all servers in a Consul datacenter must be running with Raft protocol
|
2019-05-15 10:49:41 -05:00
|
|
|
|
version 3 or later.
|
2017-03-10 15:19:50 -08:00
|
|
|
|
|
2016-11-08 12:12:57 -08:00
|
|
|
|
## Consul 0.7.1
|
|
|
|
|
|
|
|
|
|
#### Child Process Reaping
|
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Child process reaping support has been removed, along with the `reap`
|
|
|
|
|
configuration option. Reaping is also done via
|
|
|
|
|
[dumb-init](https://github.com/Yelp/dumb-init) in the [Consul Docker
|
|
|
|
|
image](https://github.com/hashicorp/docker-consul), so removing it from Consul
|
|
|
|
|
itself simplifies the code and eases future maintenance for Consul. If you are
|
|
|
|
|
running Consul as PID 1 in a container you will need to arrange for a wrapper
|
|
|
|
|
process to reap child processes.
|
2016-11-08 12:12:57 -08:00
|
|
|
|
|
|
|
|
|
#### DNS Resiliency Defaults
|
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
The default for [`max_stale`](/docs/agent/options#max_stale) has been
|
2019-05-15 10:49:41 -05:00
|
|
|
|
increased from 5 seconds to a near-indefinite threshold (10 years) to allow DNS
|
|
|
|
|
queries to continue to be served in the event of a long outage with no leader.
|
|
|
|
|
A new telemetry counter was added at `consul.dns.stale_queries` to track when
|
|
|
|
|
agents serve DNS queries that are stale by more than 5 seconds.
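
If you prefer the earlier staleness limit, a sketch of restoring it is shown below. It assumes `max_stale` lives in the `dns_config` block and accepts a duration string; the `5s` value simply mirrors the previous default mentioned above.

```json
{
  "dns_config": {
    "max_stale": "5s"
  }
}
```

With this setting, DNS queries stop being served from stale data beyond 5 seconds, which trades availability during a leader outage for fresher answers.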
|
2016-11-08 12:12:57 -08:00
|
|
|
|
|
2016-07-01 12:26:14 -07:00
|
|
|
|
## Consul 0.7
|
|
|
|
|
|
2016-08-16 15:10:52 -07:00
|
|
|
|
Consul version 0.7 is a very large release with many important changes. Changes
|
|
|
|
|
to be aware of during an upgrade are categorized below.
|
|
|
|
|
|
2016-09-01 00:22:09 -07:00
|
|
|
|
#### Performance Timing Defaults and Tuning
|
2016-08-24 17:33:53 -07:00
|
|
|
|
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Consul 0.7 now defaults the DNS configuration to allow for stale queries by
|
2020-04-09 19:46:54 -04:00
|
|
|
|
defaulting [`allow_stale`](/docs/agent/options#allow_stale) to true for
|
2019-05-15 10:49:41 -05:00
|
|
|
|
better utilization of available servers. If you want to retain the previous
|
|
|
|
|
behavior, set the following configuration:
|
2016-08-30 13:40:43 -07:00
|
|
|
|
|
|
|
|
|
```javascript
{
  "dns_config": {
    "allow_stale": false
  }
}
```
|
|
|
|
|
|
|
|
|
|
Consul 0.7 also introduced support for tuning Raft performance using a new
[performance configuration block](/docs/agent/options#performance). Also,
the default Raft timing is set to a lower-performance mode suitable for
[minimal Consul servers](/docs/install/performance#minimum).
|
2016-08-24 17:33:53 -07:00
|
|
|
|
|
|
|
|
|
To continue to use the high-performance settings that were the default prior to
|
2019-05-15 10:49:41 -05:00
|
|
|
|
Consul 0.7 (recommended for production servers), add the following
|
|
|
|
|
configuration to all Consul servers when upgrading:
|
2016-08-24 17:33:53 -07:00
|
|
|
|
|
|
|
|
|
```javascript
{
  "performance": {
    "raft_multiplier": 1
  }
}
```
|
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
See the [Server Performance](/docs/install/performance) guide for more details.
|
2016-08-24 17:33:53 -07:00
|
|
|
|
|
2016-09-01 00:22:09 -07:00
|
|
|
|
#### Leave-Related Configuration Defaults
|
2016-08-16 15:10:52 -07:00
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
The default behavior of [`leave_on_terminate`](/docs/agent/options#leave_on_terminate)
|
|
|
|
|
and [`skip_leave_on_interrupt`](/docs/agent/options#skip_leave_on_interrupt)
|
2016-09-01 00:22:09 -07:00
|
|
|
|
are now dependent on whether or not the agent is acting as a server or client:
|
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- For servers, `leave_on_terminate` defaults to `false` and `skip_leave_on_interrupt`
  defaults to `true`.

- For clients, `leave_on_terminate` defaults to `true` and `skip_leave_on_interrupt`
  defaults to `false`.
|
2016-09-01 00:22:09 -07:00
|
|
|
|
|
|
|
|
|
These defaults are designed to be safer for servers so that you must explicitly
|
|
|
|
|
configure them to leave the cluster. This also results in a better experience for
|
|
|
|
|
clients, especially in cloud environments where they may be created and destroyed
|
|
|
|
|
often and users prefer not to wait for the 72 hour reap time for cleanup.
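
Because the defaults now differ by agent role, it can help to set the values explicitly where the new default is not what you want. The fragment below is a sketch for a server that should still leave the cluster gracefully on an interrupt signal, overriding the new server default of `true`; it assumes a JSON configuration file.

```json
{
  "skip_leave_on_interrupt": false
}
```

A client that should stay in the member list when terminated would set `leave_on_terminate` to `false` in the same way.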
|
2016-08-16 15:10:52 -07:00
|
|
|
|
|
2016-08-09 18:10:04 -07:00
|
|
|
|
#### Dropped Support for Protocol Version 1
|
|
|
|
|
|
|
|
|
|
Consul version 0.7 dropped support for protocol version 1, which means it
|
|
|
|
|
is no longer compatible with versions of Consul prior to 0.3. You will need
|
|
|
|
|
to upgrade all agents to a newer version of Consul before upgrading to Consul
|
|
|
|
|
0.7.
|
|
|
|
|
|
|
|
|
|
#### Prepared Query Changes
|
|
|
|
|
|
2016-07-01 12:26:14 -07:00
|
|
|
|
Consul version 0.7 adds a feature which allows prepared queries to store a
|
2020-04-09 19:46:54 -04:00
|
|
|
|
[`Near` parameter](/api/query#near) in the query definition
|
2016-07-01 12:26:14 -07:00
|
|
|
|
itself. This feature enables using the distance sorting features of prepared
|
|
|
|
|
queries without explicitly providing the node to sort near in requests, but
|
|
|
|
|
requires the agent servicing a request to send additional information about
|
|
|
|
|
itself to the Consul servers when executing the prepared query. Agents prior
|
2016-08-16 15:10:52 -07:00
|
|
|
|
to 0.7 do not send this information, which means they are unable to properly
|
2016-07-01 12:26:14 -07:00
|
|
|
|
execute prepared queries configured with a `Near` parameter. Similarly, any
|
2016-08-16 15:10:52 -07:00
|
|
|
|
server nodes prior to version 0.7 are unable to store the `Near` parameter,
|
2016-07-01 12:26:14 -07:00
|
|
|
|
making them unable to properly serve requests for prepared queries using the
|
2016-08-16 15:10:52 -07:00
|
|
|
|
feature. It is recommended that all agents be running version 0.7 prior to
|
2016-07-01 12:26:14 -07:00
|
|
|
|
using this feature.
|
|
|
|
|
|
2016-08-15 15:47:15 -07:00
|
|
|
|
#### WAN Address Translation in HTTP Endpoints
|
|
|
|
|
|
|
|
|
|
Consul version 0.7 added support for translating WAN addresses in certain
|
2020-04-09 19:46:54 -04:00
|
|
|
|
[HTTP endpoints](/docs/agent/options#translate_wan_addrs). The servers
|
2016-08-16 15:10:52 -07:00
|
|
|
|
and the agents need to be running version 0.7 or later in order to use this
|
2016-08-15 15:47:15 -07:00
|
|
|
|
feature.
|
|
|
|
|
|
2016-09-01 00:22:09 -07:00
|
|
|
|
These translated addresses could break HTTP endpoint consumers that are
|
2020-04-13 18:55:29 -04:00
|
|
|
|
expecting local addresses, so a new [`X-Consul-Translate-Addresses`](/api#translated-addresses)
|
2016-08-16 15:10:52 -07:00
|
|
|
|
header was added to allow clients to detect if translation is enabled for HTTP
|
2016-09-01 00:22:09 -07:00
|
|
|
|
responses. A "lan" tag was added to `TaggedAddresses` for clients that need
|
2016-08-16 15:10:52 -07:00
|
|
|
|
the local address regardless of translation.
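
As a sketch of turning the feature on, the fragment below sets the `translate_wan_addrs` option referenced by the link above. It assumes a JSON configuration file, and that all servers and agents involved are already running 0.7 or later.

```json
{
  "translate_wan_addrs": true
}
```

Consumers of the HTTP endpoints can then check for the `X-Consul-Translate-Addresses` header, or read the "lan" entry in `TaggedAddresses` when they need the local address.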
|
|
|
|
|
|
2016-09-01 00:22:09 -07:00
|
|
|
|
#### Outage Recovery and `peers.json` Changes
|
2016-08-16 15:10:52 -07:00
|
|
|
|
|
|
|
|
|
The `peers.json` file is no longer present by default and is only used when
|
|
|
|
|
performing recovery. This file will be deleted after Consul starts and ingests
|
2016-09-01 00:22:09 -07:00
|
|
|
|
the file. Consul 0.7 also uses a new, automatically-created raft/peers.info file
|
|
|
|
|
to avoid ingesting the `peers.json` file on the first start after upgrading (the
|
|
|
|
|
`peers.json` file is simply deleted on the first start after upgrading).
|
2016-08-16 15:10:52 -07:00
|
|
|
|
|
2020-09-10 13:32:06 -04:00
|
|
|
|
Please be sure to review the [Outage Recovery tutorial](https://learn.hashicorp.com/tutorials/consul/recovery-outage)
|
2016-08-16 15:10:52 -07:00
|
|
|
|
before upgrading for more details.
|
|
|
|
|
|
2016-02-24 01:33:10 -08:00
|
|
|
|
## Consul 0.6.4
|
|
|
|
|
|
|
|
|
|
Consul 0.6.4 made some substantial changes to how ACLs work with prepared
|
|
|
|
|
queries. Existing queries will execute with no changes, but there are important
|
|
|
|
|
differences to understand about how prepared queries are managed before you
|
|
|
|
|
upgrade. In particular, prepared queries with no `Name` defined will no longer
|
|
|
|
|
require any ACL to manage them, and prepared queries with a `Name` defined are
|
2016-03-04 16:32:53 -08:00
|
|
|
|
now governed by a new `query` ACL policy that will need to be configured
|
2016-02-24 01:33:10 -08:00
|
|
|
|
after the upgrade.
|
|
|
|
|
|
2020-04-09 19:46:54 -04:00
|
|
|
|
See the [ACL rules documentation](/docs/acl/acl-rules#prepared-query-rules) for more details
|
2017-04-04 18:51:47 -07:00
|
|
|
|
about the new behavior and how it compares to previous versions of Consul.
|
2016-02-24 01:33:10 -08:00
|
|
|
|
|
2015-06-11 17:48:08 -07:00
|
|
|
|
## Consul 0.6
|
|
|
|
|
|
2015-10-15 14:42:46 -07:00
|
|
|
|
Consul version 0.6 is a very large release with many enhancements and
|
|
|
|
|
optimizations. Changes to be aware of during an upgrade are categorized below.
|
|
|
|
|
|
2016-08-04 06:39:50 -07:00
|
|
|
|
#### Data Store Changes
|
2015-10-15 14:42:46 -07:00
|
|
|
|
|
|
|
|
|
Consul changed the format used to store data on the server nodes in version 0.5
|
|
|
|
|
(see 0.5.1 notes below for details). Previously, Consul would automatically
|
|
|
|
|
detect data directories using the old LMDB format, and convert them to the newer
|
|
|
|
|
BoltDB format. This automatic upgrade has been removed for Consul 0.6, and
|
|
|
|
|
instead a safeguard has been put in place which will prevent Consul from booting
|
|
|
|
|
if the old directory format is detected.
|
|
|
|
|
|
|
|
|
|
It is still possible to migrate from a 0.5.x version of Consul to 0.6+ using the
|
|
|
|
|
[consul-migrate](https://github.com/hashicorp/consul-migrate) CLI utility. This
|
|
|
|
|
is the same tool that was previously embedded into Consul. See the
|
|
|
|
|
[releases](https://github.com/hashicorp/consul-migrate/releases) page for
|
|
|
|
|
downloadable versions of the tool.
|
|
|
|
|
|
2016-01-08 18:49:31 -08:00
|
|
|
|
Also, in this release Consul switched from LMDB to a fully in-memory database for
|
|
|
|
|
the state store. Because LMDB is a disk-based backing store, it was able to store
|
|
|
|
|
more data than could fit in RAM in some cases (though this is not a recommended
|
|
|
|
|
configuration for Consul). If you have an extremely large data set that won't fit
|
2016-01-08 19:15:42 -08:00
|
|
|
|
into RAM, you may encounter issues upgrading to Consul 0.6.0 and later. Consul
|
|
|
|
|
should be provisioned with physical memory approximately 2X the data set size to
|
|
|
|
|
allow for bursty allocations and subsequent garbage collection.
|
2016-01-08 18:49:31 -08:00
|
|
|
|
|
2015-10-15 14:42:46 -07:00
|
|
|
|
#### ACL Enhancements
|
|
|
|
|
|
2015-06-11 17:48:08 -07:00
|
|
|
|
Consul 0.6 introduces enhancements to the ACL system which may require special
|
|
|
|
|
handling:
|
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- Service ACLs are enforced during service discovery (REST + DNS)
|
2015-06-11 17:48:08 -07:00
|
|
|
|
|
|
|
|
|
Previously, service discovery was wide open, and any client could query
|
|
|
|
|
information about any service without providing a token. Consul now requires
|
2015-12-02 10:32:00 -08:00
|
|
|
|
read-level access at a minimum when ACLs are enabled to return service
|
2015-06-11 17:48:08 -07:00
|
|
|
|
information over the REST or DNS interfaces. If clients depend on an open
|
|
|
|
|
service discovery system, then the following should be added to all ACL tokens
|
|
|
|
|
which require it:
|
|
|
|
|
|
|
|
|
|
    # Enable discovery of all services
    service "" {
      policy = "read"
    }
|
|
|
|
|
|
2016-11-25 11:00:02 -05:00
|
|
|
|
When the DNS interface is queried, the agent's
|
2020-04-09 19:46:54 -04:00
|
|
|
|
[`acl_token`](/docs/agent/options#acl_token) is used, so be sure
|
2016-11-25 11:00:02 -05:00
|
|
|
|
that token has sufficient privileges to return the DNS records you
|
|
|
|
|
expect to retrieve from it.
|
2015-06-11 17:48:08 -07:00
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- Event and keyring ACLs
|
2015-12-02 10:32:00 -08:00
|
|
|
|
|
|
|
|
|
Similar to service discovery, the new event and keyring ACLs will block access
|
|
|
|
|
to these operations if the `acl_default_policy` is set to `deny`. If clients depend
|
|
|
|
|
on open access to these, then the following should be added to all ACL tokens which
|
|
|
|
|
require them:
|
|
|
|
|
|
|
|
|
|
    event "" {
      policy = "write"
    }

    keyring = "write"
|
|
|
|
|
|
|
|
|
|
Unfortunately, these are new ACLs for Consul 0.6, so they must be added after the
|
|
|
|
|
upgrade is complete.
|
|
|
|
|
|
|
|
|
|
#### Prepared Queries
|
|
|
|
|
|
|
|
|
|
Prepared queries introduce a new Raft log entry type that isn't supported on older
|
|
|
|
|
versions of Consul. It's important to not use the prepared query features of Consul
|
|
|
|
|
until all servers in a cluster have been upgraded to version 0.6.0.
|
|
|
|
|
|
2015-12-07 17:58:43 -08:00
|
|
|
|
#### Single Private IP Enforcement
|
|
|
|
|
|
|
|
|
|
Consul will refuse to start if there are multiple private IPs available, so
|
|
|
|
|
if this is the case you will need to configure Consul's advertise or bind addresses
|
|
|
|
|
before upgrading.
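
On a host with more than one private IP, pinning the address might look like the fragment below. This is a sketch assuming a JSON configuration file; `10.0.0.5` is a hypothetical address standing in for whichever private IP the agent should use.

```json
{
  "bind_addr": "10.0.0.5"
}
```

If the agent should bind broadly but advertise one specific address to the rest of the cluster, the `advertise_addr` option can be set instead.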
|
|
|
|
|
|
2015-12-15 15:17:11 -08:00
|
|
|
|
#### New Web UI File Layout
|
|
|
|
|
|
|
|
|
|
The release .zip file for Consul's web UI no longer contains a `dist` sub-folder;
|
|
|
|
|
everything has been moved up one level. If you have any automated scripts that
|
|
|
|
|
expect the old layout you may need to update them.
|
|
|
|
|
|
2015-04-10 17:52:49 -07:00
|
|
|
|
## Consul 0.5.1
|
|
|
|
|
|
|
|
|
|
Consul version 0.5.1 uses a different backend store for persisting the Raft
|
|
|
|
|
log. Because of this change, a data migration is necessary to move the log
|
|
|
|
|
entries out of LMDB and into the newer backend, BoltDB.
|
|
|
|
|
|
2015-06-02 13:02:35 -05:00
|
|
|
|
Consul version 0.5.1+ makes this transition seamless and easy. As a user, there
are no special steps you need to take. When Consul starts, it checks
for the presence of the legacy LMDB data files, and migrates them automatically
if any are found. You will see a log emitted when Raft data is migrated, like
this:
|
|
|
|
|
|
|
|
|
|
```
==> Successfully migrated raft data in 5.839642ms
```
|
2015-04-10 17:52:49 -07:00
|
|
|
|
|
2015-06-02 13:02:35 -05:00
|
|
|
|
This automatic upgrade exists only in Consul 0.5.1 and later 0.5.x releases, and
it is removed starting with Consul 0.6.0. It will still be possible to upgrade directly
from pre-0.5.1 versions by using the consul-migrate utility, which is available on the
[Consul Tools page](/docs/download-tools).
|
2015-04-10 17:52:49 -07:00
|
|
|
|
|
2015-02-19 11:03:02 -08:00
|
|
|
|
## Consul 0.5
|
|
|
|
|
|
|
|
|
|
Consul version 0.5 adds two features that complicate the upgrade process:
|
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- ACL system includes service discovery and registration
- Internal use of tombstones to fix behavior of blocking queries
  in certain edge cases.
|
|
|
|
|
|
|
|
|
|
Users of the ACL system need to be aware that deploying Consul 0.5 will
|
|
|
|
|
cause service registration to be enforced. This means if an agent
|
|
|
|
|
attempts to register a service without proper privileges it will be denied.
|
|
|
|
|
If the `acl_default_policy` is "allow" then clients will continue to
|
|
|
|
|
work without an updated policy. If the policy is "deny", then all clients
|
|
|
|
|
will begin to have their registration rejected causing issues.
|
|
|
|
|
|
|
|
|
|
To avoid this situation, all the ACL policies should be updated to
|
|
|
|
|
add something like this:
|
|
|
|
|
|
|
|
|
|
    # Enable all services to be registered
    service "" {
      policy = "write"
    }
|
|
|
|
|
|
|
|
|
|
This will set the service policy to `write` level for all services.
|
|
|
|
|
The blank service name is the catch-all value. A more specific service
|
|
|
|
|
can also be specified:
|
|
|
|
|
|
|
|
|
|
    # Enable only the API service to be registered
    service "api" {
      policy = "write"
    }
|
|
|
|
|
|
|
|
|
|
The ACL policy can be updated while running 0.4, and enforcement will
begin with the upgrade to 0.5. The policy updates will ensure the
availability of the cluster.
|
|
|
|
|
|
|
|
|
|
The second major change is the new internal command used for tombstones.
The details of the change are not important; however, to function, the leader
node will replicate a new command to its followers. Consul is designed
defensively, and when a command that is not recognized is received, the
server will panic. This is a purposeful design decision to avoid the possibility
of data loss, inconsistencies, or security issues caused by future incompatibility.
|
2015-02-19 11:03:02 -08:00
|
|
|
|
|
|
|
|
|
In practice, this means if a Consul 0.5 node is the leader, all of its
|
|
|
|
|
followers must also be running 0.5. There are a number of ways to do this
|
|
|
|
|
to ensure cluster availability:
|
|
|
|
|
|
2020-04-06 16:27:35 -04:00
|
|
|
|
- Add new 0.5 nodes, then remove the old servers. This will add the new
  nodes as followers, and once the old servers are removed, one of the
  0.5 nodes will become leader.

- Upgrade the followers first, then the leader last. Using `consul info`,
  you can determine which nodes are followers. Do an in-place upgrade
  on them first, and finally upgrade the leader last.

- Upgrade them in any order, but ensure all are done within 15 minutes.
  Even if the leader is upgraded to 0.5 first, as long as all of the followers
  are running 0.5 within 15 minutes there will be no issues.
|
|
|
|
|
|
|
|
|
|
Finally, even if any of the methods above are not possible or the process
|
|
|
|
|
fails for some reason, it is not fatal. The older version of the server
|
|
|
|
|
will simply panic and stop. At that point, you can upgrade to the new version
|
|
|
|
|
and restart the agent. There will be no data loss and the cluster will
|
|
|
|
|
resume operations.
|