mirror of https://github.com/status-im/consul.git
484 lines
18 KiB
Plaintext
484 lines
18 KiB
Plaintext
---
|
|
layout: api
|
|
page_title: Autopilot - Operator - HTTP API
|
|
description: |-
|
|
The /operator/autopilot endpoints allow for automatic operator-friendly
|
|
management of Consul servers including cleanup of dead servers, monitoring
|
|
the state of the Raft cluster, and stable server introduction.
|
|
---
|
|
|
|
# Autopilot Operator HTTP API
|
|
|
|
The `/operator/autopilot` endpoints allow for automatic operator-friendly
|
|
management of Consul servers including cleanup of dead servers, monitoring
|
|
the state of the Raft cluster, and stable server introduction.
|
|
|
|
Please check the [Autopilot tutorial](https://learn.hashicorp.com/tutorials/consul/autopilot-datacenter-operations) for more details.
|
|
|
|
## Read Configuration
|
|
|
|
This endpoint retrieves its latest Autopilot configuration.
|
|
|
|
| Method | Path | Produces |
|
|
| ------ | ----------------------------------- | ------------------ |
|
|
| `GET` | `/operator/autopilot/configuration` | `application/json` |
|
|
|
|
The table below shows this endpoint's support for
|
|
[blocking queries](/api/features/blocking),
|
|
[consistency modes](/api/features/consistency),
|
|
[agent caching](/api/features/caching), and
|
|
[required ACLs](/api#authentication).
|
|
|
|
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
|
|
| ---------------- | ----------------- | ------------- | --------------- |
|
|
| `NO` | `none` | `none` | `operator:read` |
|
|
|
|
### Parameters
|
|
|
|
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
|
|
the datacenter of the agent being queried. This is specified as part of the
|
|
URL as a query string.
|
|
|
|
- `stale` `(bool: false)` - If the cluster does not currently have a leader an
|
|
error will be returned. You can use the `?stale` query parameter to read the
|
|
Raft configuration from any of the Consul servers.
|
|
|
|
### Sample Request
|
|
|
|
```shell-session
|
|
$ curl \
|
|
http://127.0.0.1:8500/v1/operator/autopilot/configuration
|
|
```
|
|
|
|
### Sample Response
|
|
|
|
```json
|
|
{
|
|
"CleanupDeadServers": true,
|
|
"LastContactThreshold": "200ms",
|
|
"MaxTrailingLogs": 250,
|
|
"ServerStabilizationTime": "10s",
|
|
"RedundancyZoneTag": "",
|
|
"DisableUpgradeMigration": false,
|
|
"UpgradeVersionTag": "",
|
|
"CreateIndex": 4,
|
|
"ModifyIndex": 4
|
|
}
|
|
```
|
|
|
|
For more information about the Autopilot configuration options, see the
|
|
[agent configuration section](/docs/agent/options#autopilot).
|
|
|
|
## Update Configuration
|
|
|
|
This endpoint updates the Autopilot configuration of the cluster.
|
|
|
|
| Method | Path | Produces |
|
|
| ------ | ----------------------------------- | ------------------ |
|
|
| `PUT` | `/operator/autopilot/configuration` | `application/json` |
|
|
|
|
The table below shows this endpoint's support for
|
|
[blocking queries](/api/features/blocking),
|
|
[consistency modes](/api/features/consistency),
|
|
[agent caching](/api/features/caching), and
|
|
[required ACLs](/api#authentication).
|
|
|
|
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
|
|
| ---------------- | ----------------- | ------------- | ---------------- |
|
|
| `NO` | `none` | `none` | `operator:write` |
|
|
|
|
### Parameters
|
|
|
|
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
|
|
the datacenter of the agent being queried. This is specified as part of the
|
|
URL as a query string.
|
|
|
|
- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
|
|
only happen if the given index matches the `ModifyIndex` of the configuration
|
|
at the time of writing.
|
|
|
|
- `CleanupDeadServers` `(bool: true)` - Specifies automatic removal of dead
|
|
server nodes periodically and whenever a new server is added to the cluster.
|
|
|
|
- `LastContactThreshold` `(string: "200ms")` - Specifies the maximum amount of
|
|
time a server can go without contact from the leader before being considered
|
|
unhealthy. Must be a duration value such as `10s`.
|
|
|
|
- `MaxTrailingLogs` `(int: 250)` specifies the maximum number of log entries
|
|
that a server can trail the leader by before being considered unhealthy.
|
|
|
|
- `MinQuorum` `(int: 0)` - specifies the minimum number of servers needed before
|
|
Autopilot can prune dead servers.
|
|
|
|
- `ServerStabilizationTime` `(string: "10s")` - Specifies the minimum amount of
|
|
time a server must be stable in the 'healthy' state before being added to the
|
|
cluster. Only takes effect if all servers are running Raft protocol version 3
|
|
or higher. Must be a duration value such as `30s`.
|
|
|
|
- `RedundancyZoneTag` `(string: "")` - Controls the node-meta key to use when
|
|
Autopilot is separating servers into zones for redundancy. Only one server in
|
|
each zone can be a voting member at one time. If left blank, this feature will
|
|
be disabled.
|
|
|
|
- `DisableUpgradeMigration` `(bool: false)` - Disables Autopilot's upgrade
|
|
migration strategy in Consul Enterprise of waiting until enough
|
|
newer-versioned servers have been added to the cluster before promoting any of
|
|
them to voters.
|
|
|
|
- `UpgradeVersionTag` `(string: "")` - Controls the node-meta key to use for
|
|
version info when performing upgrade migrations. If left blank, the Consul
|
|
version will be used.
|
|
|
|
### Sample Payload
|
|
|
|
```json
|
|
{
|
|
"CleanupDeadServers": true,
|
|
"LastContactThreshold": "200ms",
|
|
"MaxTrailingLogs": 250,
|
|
"MinQuorum": 3,
|
|
"ServerStabilizationTime": "10s",
|
|
"RedundancyZoneTag": "",
|
|
"DisableUpgradeMigration": false,
|
|
"UpgradeVersionTag": "",
|
|
"CreateIndex": 4,
|
|
"ModifyIndex": 4
|
|
}
|
|
```
|
|
|
|
## Read Health
|
|
|
|
This endpoint queries the health of the autopilot status.
|
|
|
|
| Method | Path | Produces |
|
|
| ------ | ---------------------------- | ------------------ |
|
|
| `GET` | `/operator/autopilot/health` | `application/json` |
|
|
|
|
The table below shows this endpoint's support for
|
|
[blocking queries](/api/features/blocking),
|
|
[consistency modes](/api/features/consistency),
|
|
[agent caching](/api/features/caching), and
|
|
[required ACLs](/api#authentication).
|
|
|
|
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
|
|
| ---------------- | ----------------- | ------------- | --------------- |
|
|
| `NO` | `none` | `none` | `operator:read` |
|
|
|
|
### Parameters
|
|
|
|
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
|
|
the datacenter of the agent being queried. This is specified as part of the
|
|
URL as a query string.
|
|
|
|
### Sample Request
|
|
|
|
```shell-session
|
|
$ curl \
|
|
http://127.0.0.1:8500/v1/operator/autopilot/health
|
|
```
|
|
|
|
### Sample response
|
|
|
|
```json
|
|
{
|
|
"Healthy": true,
|
|
"FailureTolerance": 0,
|
|
"Servers": [
|
|
{
|
|
"ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
|
|
"Name": "node1",
|
|
"Address": "127.0.0.1:8300",
|
|
"SerfStatus": "alive",
|
|
"Version": "0.7.4",
|
|
"Leader": true,
|
|
"LastContact": "0s",
|
|
"LastTerm": 2,
|
|
"LastIndex": 46,
|
|
"Healthy": true,
|
|
"Voter": true,
|
|
"StableSince": "2017-03-06T22:07:51Z"
|
|
},
|
|
{
|
|
"ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
|
|
"Name": "node2",
|
|
"Address": "127.0.0.1:8205",
|
|
"SerfStatus": "alive",
|
|
"Version": "0.7.4",
|
|
"Leader": false,
|
|
"LastContact": "27.291304ms",
|
|
"LastTerm": 2,
|
|
"LastIndex": 46,
|
|
"Healthy": true,
|
|
"Voter": false,
|
|
"StableSince": "2017-03-06T22:18:26Z"
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
- `Healthy` is whether all the servers are currently healthy.
|
|
|
|
- `FailureTolerance` is the number of redundant healthy servers that could be
|
|
fail without causing an outage (this would be 2 in a healthy cluster of 5
|
|
servers).
|
|
|
|
- `Servers` holds detailed health information on each server:
|
|
|
|
- `ID` is the Raft ID of the server.
|
|
|
|
- `Name` is the node name of the server.
|
|
|
|
- `Address` is the address of the server.
|
|
|
|
- `SerfStatus` is the SerfHealth check status for the server.
|
|
|
|
- `Version` is the Consul version of the server.
|
|
|
|
- `Leader` is whether this server is currently the leader.
|
|
|
|
- `LastContact` is the time elapsed since this server's last contact with the leader.
|
|
|
|
- `LastTerm` is the server's last known Raft leader term.
|
|
|
|
- `LastIndex` is the index of the server's last committed Raft log entry.
|
|
|
|
- `Healthy` is whether the server is healthy according to the current Autopilot configuration.
|
|
|
|
- `Voter` is whether the server is a voting member of the Raft cluster.
|
|
|
|
- `StableSince` is the time this server has been in its current `Healthy` state.
|
|
|
|
The HTTP status code will indicate the health of the cluster. If `Healthy` is true, then a
|
|
status of 200 will be returned. If `Healthy` is false, then a status of 429 will be returned.
|
|
|
|
## Read the Autopilot State
|
|
|
|
This endpoint queries the health of the autopilot status.
|
|
|
|
| Method | Path | Produces |
|
|
| ------ | --------------------------- | ------------------ |
|
|
| `GET` | `/operator/autopilot/state` | `application/json` |
|
|
|
|
The table below shows this endpoint's support for
|
|
[blocking queries](/api/features/blocking),
|
|
[consistency modes](/api/features/consistency),
|
|
[agent caching](/api/features/caching), and
|
|
[required ACLs](/api#authentication).
|
|
|
|
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
|
|
| ---------------- | ----------------- | ------------- | --------------- |
|
|
| `NO` | `none` | `none` | `operator:read` |
|
|
|
|
### Parameters
|
|
|
|
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
|
|
the datacenter of the agent being queried. This is specified as part of the
|
|
URL as a query string.
|
|
|
|
### Sample Request
|
|
|
|
```shell-session
|
|
$ curl \
|
|
http://127.0.0.1:8500/v1/operator/autopilot/state
|
|
```
|
|
|
|
### Response Format
|
|
|
|
```json
|
|
{
|
|
"Healthy": true,
|
|
"FailureTolerance": 1,
|
|
"OptimisticFailureTolerance": 4,
|
|
"Servers": {
|
|
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278": {},
|
|
"10b71f14-4b08-4ae5-840c-f86d39e7d330": {},
|
|
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2": {},
|
|
"63783741-abd7-48a9-895a-33d01bf7cb30": {},
|
|
"6cf04fd0-7582-474f-b408-a830b5471285": {}
|
|
},
|
|
"Leader": "5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
|
|
"Voters": [
|
|
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
|
|
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
|
|
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2"
|
|
],
|
|
"RedundancyZones": {
|
|
"az1": {},
|
|
"az2": {},
|
|
"az3": {}
|
|
},
|
|
"ReadReplicas": [
|
|
"63783741-abd7-48a9-895a-33d01bf7cb30",
|
|
"6cf04fd0-7582-474f-b408-a830b5471285"
|
|
],
|
|
"Upgrade": {}
|
|
}
|
|
```
|
|
|
|
- `Healthy` is whether all the servers are currently healthy.
|
|
|
|
- `FailureTolerance` is the number of redundant healthy servers that could be
|
|
fail without causing an outage (this would be 2 in a healthy cluster of 5
|
|
servers).
|
|
- `OptimisticFailuretolerance` <EnterpriseAlert inline /> is the maximum number
|
|
of servers that could fail in the right order over the right period of time
|
|
without causing an outage. This value is only useful when using the [Redundancy
|
|
Zones feature](/docs/enterprise/redundancy) with autopilot.
|
|
|
|
- `Servers` is a mapping of server ID to an object holding detailed information about that server.
|
|
The format of the detailed info is documented in its own section.
|
|
|
|
- `Leader` is the server ID of current leader. This value can be used as an index into the `Servers` object.
|
|
|
|
- `Voters` is a list of server IDs that are voters. These values can be used as indexes into the `Servers` object.
|
|
|
|
- `RedundancyZones` <EnterpriseAlert inline /> is mapping of redundancy zone name to redundancy zone information.
|
|
The format of the redundancy zone information is documented in its own section.
|
|
|
|
- `ReadReplicas` <EnterpriseAlert inline /> is a list of server IDs that autopilot has identified as read replicas.
|
|
These will never be promoted. These values can be used as indexes into the `Servers` map.
|
|
- `Upgrade` <EnterpriseAlert inline /> is an object holding all the information about any ongoing automated upgrade.
|
|
The format of this object is detailed in its own section.
|
|
|
|
### Server Response Format
|
|
|
|
```json
|
|
{
|
|
"ID": "1c3e3278-3f88-4a97-9f6a-1058584e8058",
|
|
"Name": "node1",
|
|
"Address": "198.18.0.1:8300",
|
|
"NodeStatus": "alive",
|
|
"Version": "1.9.0+ent",
|
|
"LastContact": "1.321ms",
|
|
"LastTerm": 4,
|
|
"LastIndex": 42,
|
|
"Healthy": true,
|
|
"StableSince": "2020-08-12T12:13:14Z",
|
|
"RedundancyZone": "az1",
|
|
"UpgradeVersion": "1.2.3",
|
|
"ReadReplica": false,
|
|
"Status": "voter",
|
|
"Meta": {
|
|
"build": "1.2.3",
|
|
"zone": "az1"
|
|
},
|
|
"NodeType": "redundancy-zone-voter"
|
|
}
|
|
```
|
|
|
|
- `ID` is the Raft ID of the server.
|
|
|
|
- `Name` is the node name of the server.
|
|
|
|
- `Address` is the address of the server.
|
|
|
|
- `NodeStatus` is the SerfHealth check status for the server.
|
|
|
|
- `Version` is the Consul version of the server.
|
|
|
|
- `LastContact` is the time elapsed since this server's last contact with the leader.
|
|
|
|
- `LastTerm` is the server's last known Raft leader term.
|
|
|
|
- `LastIndex` is the index of the server's last committed Raft log entry.
|
|
|
|
- `Healthy` is whether the server is healthy according to the current Autopilot configuration.
|
|
|
|
- `StableSince` is the time this server has been in its current `Healthy` state.
|
|
- `RedundancyZone` <EnterpriseAlert inline /> is the name of the redundancy zone this server is within.
|
|
- `UpgradeVersion` <EnterpriseAlert inline /> is the version that will be used for automated upgrade calculations.
|
|
- `ReadReplica` <EnterpriseAlert inline /> indicates whether this server is a read replica or not.
|
|
- `Status` indicates the current Raft status of this server. Possible values are:
|
|
`leader`, `voter`, `non-voter`, or `staging`.
|
|
- `Meta` is the node metadata of this server. Values within this map are used for determining a server's
|
|
redundancy zone and upgrade version.
|
|
- `NodeType` is the desired type autopilot thinks this server should have. In Consul OSS the only possible
|
|
value is `voter` as all present servers should having voting rights. In Consul Enterprise the possible values also
|
|
include `read-replica`, `zone-voter`, `zone-standby` and `zone-extra-voter`. `zone-voter` indicates that autopilot
|
|
wants this server to be the voter for a particular redundancy zone. When a zone has no voter all nodes will be typed
|
|
as this until one is promoted. When that happens the other non-voters in the zone will be typed as `zone-standby`.
|
|
This indicates that they are currently desired to be standby servers in case the voter from the zone fails. Finally,
|
|
the `zone-extra-voter` status indicates that autopilot wants this server to be a voter due to a failure of all servers
|
|
in another zone and that when one of the servers in that failed zone are restored, this server will be demoted.
|
|
|
|
### Redundancy Zone Response Format <EnterpriseAlert inline />
|
|
|
|
```json
|
|
{
|
|
"Servers": [
|
|
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
|
|
"b007061c-6d15-4c90-b3d6-2fef276a0650"
|
|
],
|
|
"Voters": ["b007061c-6d15-4c90-b3d6-2fef276a0650"],
|
|
"FailureTolerance": 1
|
|
}
|
|
```
|
|
|
|
Each zone in the responses `RedundancyZones` mapping will have this structure.
|
|
|
|
- `Servers` is a list of server IDs of all the servers in this zone. These values can be used as indexes
|
|
into the top level response's `Servers` mapping.
|
|
- `Voters` is a list of server IDs of all servers in this zone that have voting rights. Typically this will
|
|
be a list with 1 value but in some failure scenarios or upgrade scenarios the size could increase. These
|
|
values can be used as indexes into the top level response's `Servers` mapping.
|
|
|
|
- `FailureTolerance` is the number of servers in this zone that could fail without causing a total zone failure
|
|
and subsequent promotion of a server from another zone as a fallback.
|
|
|
|
### Upgrade Information Response Format <EnterpriseAlert inline />
|
|
|
|
```json
|
|
{
|
|
"Status": "awaiting-new-servers",
|
|
"TargetVersion": "1.9.1+ent",
|
|
"TargetVersionVoters": ["f0344689-3e1f-4125-b55d-e888d3abf514"],
|
|
"TargetVersionNonVoters": [
|
|
"619a4ba6-1a0b-476e-8a1a-28aeee7735a2",
|
|
"fd683fe6-541f-4ebf-bc5a-6eae51571ddb"
|
|
],
|
|
"TargetVersionReadReplicas": ["9f1e27ae-1129-45ef-97dd-6d8c3ec47e6a"],
|
|
"OtherVersionVoters": [
|
|
"0cbdd493-235f-48f2-98d9-1bf2443b9d72",
|
|
"21812bd7-2f21-4565-9892-2fdd3d4e1a99",
|
|
"c654ba5c-cc76-4056-a5ca-6e78d95f27ad"
|
|
],
|
|
"OtherVersionNonVoters": [
|
|
"6d973f11-6bdb-4f7d-8a90-c1300066da4c",
|
|
"6241ab45-371e-4b2a-a0f1-d847c3b7b1b0"
|
|
],
|
|
"OtherVersionReadReplicas": ["42d10fc3-581b-4403-832d-945b3a0d8841"]
|
|
}
|
|
```
|
|
|
|
- `Status` is the automated upgrade status. Possible values are:
|
|
|
|
- `disabled` indicates that automated upgrades are disabled either from user configuration or due to being unlicensed.
|
|
|
|
- `idle` indicates that there is no ongoing upgrade and that all servers are running the same Consul version.
|
|
|
|
- `await-new-voters` indicates that a newer versioned server has been added but that autopilot is waiting for more servers
|
|
of that version to be added before proceeding with the upgrade.
|
|
- `promoting` indicates that enough servers of the target version have been added and autopilot will now promote them
|
|
to voters.
|
|
- `demoting` indicates that autopilot is currently demoting the servers not running the target version.
|
|
|
|
- `leader-transfer` indicates that autopilot is in the process of transferring leadership to a server running
|
|
the target version.
|
|
- `await-new-servers` indicates that the majority of the upgrade is complete but that more servers running the target
|
|
version need to be added to completely replace all of the previous servers.
|
|
- `await-server-removal` indicates that the upgrade is complete and it is now safe to remove the previous servers.
|
|
|
|
- `TargetVersion` is the version that Autopilot is upgrading to. This will be the maximum version of all servers
|
|
`UpgradeVersion` field in the top level `Servers` mapping.
|
|
- `TargetVersionVoters` is a list of IDs of servers running the target version and that currently have voting rights.
|
|
|
|
- `TargetVersionNonVoters` is a list of IDs of servers running the target version and that currently do not have voting rights.
|
|
|
|
- `TargetVersionReadReplicas` is a list of IDs of servers running the target version and are read replicas.
|
|
|
|
- `OtherVersionVoters` is a list of IDs of servers not running the target version and that currently have voting rights.
|
|
|
|
- `OtherVersionNonVoters` is a list of IDs of servers not running the target version and that currently do not have voting rights.
|
|
|
|
- `OtherVersionReadReplicas` is a list of IDs of servers not running the target version and are read replicas.
|