diff --git a/consul/autopilot.go b/consul/autopilot.go
index b1736b3b80..9156fd49e0 100644
--- a/consul/autopilot.go
+++ b/consul/autopilot.go
@@ -48,6 +48,7 @@ func (s *Server) autopilotLoop() {
 	_, autopilotConf, err := state.AutopilotConfig()
 	if err != nil {
 		s.logger.Printf("[ERR] consul: error retrieving autopilot config: %s", err)
+		break
 	}
 
 	if err := s.autopilotPolicy.PromoteNonVoters(autopilotConf); err != nil {
diff --git a/website/source/docs/agent/options.html.markdown b/website/source/docs/agent/options.html.markdown
index 4b85c47be1..bb9e41ba7e 100644
--- a/website/source/docs/agent/options.html.markdown
+++ b/website/source/docs/agent/options.html.markdown
@@ -558,6 +558,7 @@ Consul will not enable TLS for the HTTP API unless the `https` port has been ass
 * `autopilot` Added in Consul 0.8, this object allows a
   number of sub-keys to be set which can configure operator-friendly settings for Consul servers.
+  For more information about Autopilot, see the [Autopilot Guide](/docs/guides/autopilot.html).

   The following sub-keys are available:
diff --git a/website/source/docs/guides/autopilot.html.markdown b/website/source/docs/guides/autopilot.html.markdown
new file mode 100644
index 0000000000..ce8b7b2c42
--- /dev/null
+++ b/website/source/docs/guides/autopilot.html.markdown
@@ -0,0 +1,108 @@
+---
+layout: "docs"
+page_title: "Autopilot"
+sidebar_current: "docs-guides-autopilot"
+description: |-
+  This guide covers how to configure and use Autopilot features.
+---
+
+# Autopilot
+
+Autopilot is a set of new features added in Consul 0.8 to allow for automatic
+operator-friendly management of Consul servers. It includes cleanup of dead
+servers, monitoring the state of the Raft cluster, and stable server introduction.
+
+To enable Autopilot features (with the exception of dead server cleanup),
+the [`raft_protocol`](/docs/agent/options.html#_raft_protocol) setting in
+the Agent configuration must be set to 3 or higher on all servers. In Consul
+0.8 this setting defaults to 2; in Consul 0.9 it will default to 3.
+
+## Configuration
+
+The configuration of Autopilot is loaded by the leader from the agent's
+[`autopilot`](/docs/agent/options.html#autopilot) settings when initially
+bootstrapping the cluster. After bootstrapping, the configuration can
+be viewed or modified either via the
+[`operator autopilot`](/docs/commands/operator/autopilot.html) subcommand or the
+[`/v1/operator/autopilot/configuration`](/docs/agent/http/operator.html#autopilot-configuration)
+HTTP endpoint:
+
+```
+$ consul operator autopilot get-config
+CleanupDeadServers = true
+LastContactThreshold = 200ms
+MaxTrailingLogs = 250
+ServerStabilizationTime = 10s
+
+$ consul operator autopilot set-config -cleanup-dead-servers=false
+Configuration updated!
+
+$ consul operator autopilot get-config
+CleanupDeadServers = false
+LastContactThreshold = 200ms
+MaxTrailingLogs = 250
+ServerStabilizationTime = 10s
+```
+
+## Dead Server Cleanup
+
+Dead servers will periodically be cleaned up and removed from the Raft peer
+set, to prevent them from interfering with the quorum size and leader elections.
+This cleanup will also happen whenever a new server is successfully added to the
+cluster.
+
+This option can be disabled by running `consul operator autopilot set-config`
+with the `-cleanup-dead-servers=false` option.
+
+## Server Health Checking
+
+An internal health check runs on the leader to track the stability of servers.
+A server is considered healthy if:
+
+- It has a SerfHealth status of 'Alive'
+- The time since its last contact with the current leader is below
+`LastContactThreshold`
+- Its latest Raft term matches the leader's term
+- The number of Raft log entries it trails the leader by does not exceed
+`MaxTrailingLogs`
+
+The status of these health checks can be viewed through the
+[`/v1/operator/autopilot/health`](/docs/agent/http/operator.html#autopilot-health) HTTP endpoint, with a top-level
+`Healthy` field indicating the overall status of the cluster:
+
+```
+$ curl localhost:8500/v1/operator/autopilot/health
+{
+    "Healthy": true,
+    "FailureTolerance": 0,
+    "Servers": [
+        {
+            "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
+            "Name": "node1",
+            "SerfStatus": "alive",
+            "LastContact": "0s",
+            "LastTerm": 3,
+            "LastIndex": 23,
+            "Healthy": true,
+            "StableSince": "2017-03-10T22:01:14Z"
+        },
+        {
+            "ID": "099061c7-ea74-42d5-be04-a0ad74caaaf5",
+            "Name": "node2",
+            "SerfStatus": "alive",
+            "LastContact": "53.279635ms",
+            "LastTerm": 3,
+            "LastIndex": 23,
+            "Healthy": true,
+            "StableSince": "2017-03-10T22:03:26Z"
+        }
+    ]
+}
+```
+
+## Stable Server Introduction
+
+When a new server is added to the cluster, there is a waiting period where it
+must be healthy and stable for a certain amount of time before being promoted
+to a full, voting member. This can be configured via the `ServerStabilizationTime`
+setting.
diff --git a/website/source/docs/upgrade-specific.html.markdown b/website/source/docs/upgrade-specific.html.markdown
index 3b15e0441a..987cc5aa84 100644
--- a/website/source/docs/upgrade-specific.html.markdown
+++ b/website/source/docs/upgrade-specific.html.markdown
@@ -33,6 +33,37 @@ and update any scripts that passed a custom `-rpc-addr` to the following command
 * `monitor`
 * `reload`
 
+#### Raft Protocol version compatibility
+
+When upgrading to Consul 0.8.0 from a version lower than 0.7.0, users will need to
+set the [`-raft-protocol`](/docs/agent/options.html#_raft_protocol) option to 1 in
+order to maintain backwards compatibility with the old servers during the upgrade.
+After the servers have been migrated to version 0.8.0, `-raft-protocol` can be moved
+up to 2 and the servers restarted to match the default.
+
+The Raft protocol must be stepped up in this way; only adjacent version numbers are
+compatible (for example, version 1 cannot talk to version 3). Here is a table of the Raft Protocol
+versions supported by each Consul version:
+
+<table>
+  <tr>
+    <th>Version</th>
+    <th>Supported Raft Protocols</th>
+  </tr>
+  <tr>
+    <td>0.6 and earlier</td>
+    <td>0</td>
+  </tr>
+  <tr>
+    <td>0.7</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>0.8</td>
+    <td>1, 2, 3</td>
+  </tr>
+</table>
+
 ## Consul 0.7.1
 
 #### Child Process Reaping
diff --git a/website/source/layouts/docs.erb b/website/source/layouts/docs.erb
index c1e2f28876..18ca0c2100 100644
--- a/website/source/layouts/docs.erb
+++ b/website/source/layouts/docs.erb
@@ -296,6 +296,10 @@
 Adding/Removing Servers
 
+ >
+ Autopilot
+
+
 >
 Bootstrapping
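
Review note: the upgrade text says the Raft protocol must be stepped up one version at a time because only adjacent versions can talk to each other. The Go sketch below illustrates that rule under the assumption that "adjacent" means the two version numbers differ by at most one; `compatible` and `stepPath` are hypothetical helper names and are not part of Consul.

```go
package main

import "fmt"

// compatible reports whether two Raft protocol versions can interoperate,
// taking "adjacent" to mean a difference of at most one (an assumption).
func compatible(a, b int) bool {
	d := a - b
	if d < 0 {
		d = -d
	}
	return d <= 1
}

// stepPath lists the protocol versions to pass through, one at a time,
// when moving from one version up to another.
func stepPath(from, to int) []int {
	var path []int
	for v := from; v <= to; v++ {
		path = append(path, v)
	}
	return path
}

func main() {
	// Version 1 cannot talk to version 3 directly, so the upgrade goes via 2.
	fmt.Println(compatible(1, 3))
	fmt.Println(stepPath(1, 3))
}
```

Each neighbouring pair in the returned path satisfies `compatible`, which is the property the stepped upgrade relies on.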