mirror of https://github.com/status-im/consul.git
website: Guide for adding/removing servers
This commit is contained in:
parent ab10bfc00f
commit 369f6a3dcf
@ -12,6 +12,10 @@ guidance on doing them safely.

The following guides are available:

* [Adding/Removing Servers](/docs/guides/servers.html) - This guide covers how to safely add
  and remove Consul servers from the cluster. This should be done carefully to avoid availability
  outages.

* [Bootstrapping](/docs/guides/bootstrapping.html) - This guide covers bootstrapping a new
  datacenter, including safely adding the initial Consul servers.

@ -23,6 +27,4 @@ The following guides are available:

* [Multiple Datacenters](/docs/guides/datacenters.html) - Configuring Consul to support multiple
  datacenters.

* TODO: Adding and removing servers

@ -0,0 +1,106 @@

---
layout: "docs"
page_title: "Adding/Removing Servers"
sidebar_current: "docs-guides-servers"
---

# Adding/Removing Servers

Consul is designed to require minimal operator involvement; however, any changes
to the set of Consul servers must be handled carefully. To better understand
why, reading about the [consensus protocol](/docs/internals/consensus.html) will
be useful. In short, the Consul servers perform leader election and replication.
For changes to be processed, a quorum of servers, (N/2)+1, must be available.
That means if there are 3 server nodes, at least 2 must be available; with 5
servers, at least 3 must be available, so the cluster can tolerate 2 failures.

In general, if you are ever adding and removing nodes simultaneously, it is better
to first add the new nodes and then remove the old nodes.

## Adding New Servers

Adding new servers is generally straightforward. After the initial server, no further
servers should ever be started with the `-bootstrap` flag. Instead, simply start the new
server with the `-server` flag. At this point, the server will not be a member of
any cluster, and should emit something like:

    [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.

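For illustration, here is a hedged sketch of the start command that produces this
state. The data directory path and node name are placeholders, not values from
this guide:

```
# Start a new server agent with no peers configured; it will emit the
# warning above until it is joined to an existing cluster. The -data-dir
# path and -node name are placeholder values for this sketch.
$ consul agent -server -data-dir /tmp/consul -node server-4
```
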
This warning means that the server does not know about any peers and is not
configured to elect itself. This is expected, and we can now add this node to
the existing cluster using `join`. From the new server, we can join any member
of the existing cluster:

    $ consul join <Node Address>
    Successfully joined cluster by contacting 1 nodes.

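To confirm the join took effect, the `members` command can be run from any agent;
a brief sketch (the exact output columns vary by version):

```
# List all agents known via gossip; the new server should appear
# with an "alive" status shortly after joining.
$ consul members
```
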
It is important to note that any node, including a non-server, may be specified for
the join. The gossip protocol is used to properly discover all the nodes in the cluster.
Once the node has joined, the existing cluster leader should log something like:

    [INFO] raft: Added peer 127.0.0.2:8300, starting replication

This means that Raft, the underlying consensus protocol, has added the peer and begun
replicating state. Since the existing cluster may be very far ahead, it can take some
time for the new node to catch up. To check on this, run `info` on the leader:

```
$ consul info
...
raft:
	applied_index = 47244
	commit_index = 47244
	fsm_pending = 0
	last_log_index = 47244
	last_log_term = 21
	last_snapshot_index = 40966
	last_snapshot_term = 20
	num_peers = 2
	state = Leader
	term = 21
...
```

This will provide various information about the state of Raft. In particular,
`last_log_index` shows the last log entry that is on disk. The same `info` command
can be run on the new server to see how far behind it is. Eventually the server
will catch up, and the values should match.

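One convenient way to compare the two, assuming a standard Unix shell (this
pipeline is illustrative, not from the guide itself):

```
# Run on both the leader and the new server; the new server has
# caught up once the two values are equal.
$ consul info | grep last_log_index
	last_log_index = 47244
```
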
It is best to add servers one at a time, allowing each one to catch up. This avoids
the possibility of data loss should the existing servers fail while the new
servers are still being brought up to date.

## Removing Servers

Removing servers must be done carefully to avoid causing an availability outage.
For a cluster of N servers, at least (N/2)+1 must be available for the cluster
to function. See this [deployment table](/docs/internals/consensus.html#toc_3).
If you have 3 servers and 1 of them is currently failed, removing any other server
will cause the cluster to become unavailable.

To avoid this, it may be necessary to first add new servers to the cluster,
increasing the failure tolerance of the cluster, and then remove the old servers.
Even if all 3 nodes are functioning, removing one leaves the cluster in a state
that cannot tolerate the failure of any node.

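Before removing anything, the `num_peers` value from the `info` output shown
earlier offers a quick sanity check on the leader; a minimal sketch:

```
# A healthy 3-server cluster reports 2 peers on the leader; a lower
# number suggests a server is already missing from the Raft peer set.
$ consul info | grep num_peers
	num_peers = 2
```
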
Once you have verified the existing servers are healthy, and that the cluster
can handle a node leaving, the actual process is simple: issue a
`leave` command to the server.

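A minimal sketch of that step, run on the server being removed (the confirmation
message may vary by version):

```
# Gracefully leave the cluster; the agent notifies its peers and
# then shuts down.
$ consul leave
```
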
The server that is leaving should log entries like:

    ...
    [INFO] consul: server starting leave
    ...
    [INFO] raft: Removed ourself, transitioning to follower
    ...

The leader should also emit log entries such as:

    ...
    [INFO] consul: member 'node-10-0-1-8' left, deregistering
    [INFO] raft: Removed peer 10.0.1.8:8300, stopping replication
    ...

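As a final check, the `members` command can be consulted again; this sketch
assumes typical status values rather than output taken from this guide:

```
# The departed node should show a "left" status rather than "failed",
# indicating the removal was graceful.
$ consul members
```
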
At this point the node has been gracefully removed from the cluster and
will shut down.

@ -120,6 +120,10 @@

<li<%= sidebar_current("docs-guides") %>>
  <a href="/docs/guides/index.html">Guides</a>
  <ul class="nav">
    <li<%= sidebar_current("docs-guides-servers") %>>
      <a href="/docs/guides/servers.html">Adding/Removing Servers</a>
    </li>

    <li<%= sidebar_current("docs-guides-bootstrapping") %>>
      <a href="/docs/guides/bootstrapping.html">Bootstrapping</a>
    </li>