From 2eaece10a73d24ab2a6f017238dfec7256fbdfa9 Mon Sep 17 00:00:00 2001
From: Freddy
Date: Fri, 27 Sep 2019 17:49:28 -0600
Subject: [PATCH] Update Force Leave docs (#6550)

Fixes #2742

Previously the docs didn't clarify that if a server restarts as a client
then force-leave won't lead to removing the node from the raft config.
This is because the node, which is alive after a restart, will refute
messages about it having left. These messages about members leaving are
in turn what trigger Consul's leader to remove a server from raft.
---
 .../commands/force-leave.html.markdown.erb    | 41 +++++++++++++++----
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/website/source/docs/commands/force-leave.html.markdown.erb b/website/source/docs/commands/force-leave.html.markdown.erb
index b5d48a6fd9..941c30fa1b 100644
--- a/website/source/docs/commands/force-leave.html.markdown.erb
+++ b/website/source/docs/commands/force-leave.html.markdown.erb
@@ -11,19 +11,23 @@ description: |-
 Command: `consul force-leave`
 
 The `force-leave` command forces a member of a Consul cluster to enter the
-"left" state. If the member is still actually alive, it will
-eventually rejoin the cluster. The true purpose of this method is to force
-remove "failed" nodes.
+"left" state. The purpose of this method is to force-remove a node that has failed or
+was shut down without a [graceful leave](https://www.consul.io/docs/commands/leave.html).
 
-Consul periodically tries to reconnect to "failed" nodes in case it is a
-network partition. After some configured amount of time (by default 72 hours),
+Consul periodically tries to reconnect to "failed" nodes in case failure was due
+to a network partition. After some configured amount of time (by default 72 hours),
 Consul will reap "failed" nodes and stop trying to reconnect. The `force-leave`
-command can be used to transition the "failed" nodes to "left" nodes more
-quickly.
+command can be used to transition the "failed" nodes to a "left" state more
+quickly, as reported by [`consul members`](https://www.consul.io/docs/commands/members.html).
 
 This can be particularly useful for a node that was running as a server,
-as it will be removed from the Raft quorum. Note that `force-leave` cannot be
-used to force removal of nodes that are outside of the datacenter.
+as it will eventually be removed from the Raft configuration by the leader.
+
+~> Note that for `force-leave` to take full effect the target node's agent must have
+shut down permanently. If the agent is alive and reachable then it will not be removed
+from the datacenter's member list nor from the Raft configuration. Additionally,
+if the agent returns after transitioning to the "left" state, but before it is reaped
+from the member list, then it will rejoin the cluster.
 
 ## Usage
 
@@ -33,3 +37,22 @@ Usage: `consul force-leave [options] node`
 
 <%= partial "docs/commands/http_api_options_client" %>
 
+## Examples
+
+Remove a node named `ec2-001-staging` from the local agent's datacenter:
+
+```
+consul force-leave ec2-001-staging
+```
+
+When run on a server that is part of a
+[WAN gossip pool](https://learn.hashicorp.com/consul/security-networking/datacenters),
+`force-leave` can remove failed servers in other datacenters from the WAN pool.
+
+The identifying node-name in a WAN pool is `[node-name].[datacenter]`.
+Therefore, to remove a failed server node named `server1` from
+datacenter `us-east1`, run:
+
+```
+consul force-leave server1.us-east1
+```