1118 Commits

Author SHA1 Message Date
Seth Vargo
4179aacf11
Add an API method for determining the best status
Given a list of HealthChecks, this determines the "best" status for the
collective group. This is useful for nodes and services, which may have
multiple checks associated with them.
2016-11-29 18:41:46 -05:00
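A minimal sketch of the idea in the commit above, not the code it actually added. The commit message does not spell out how the "best" status is chosen, so this sketch assumes the most severe status in the group wins (critical over warning over passing). The `HealthCheck` struct, the `aggregatedStatus` function, and the status strings are hypothetical names used only for illustration.

```go
package main

// HealthCheck is a hypothetical, pared-down stand-in for a check record.
type HealthCheck struct {
	CheckID string
	Status  string
}

// Assumed status names, ordered from least to most severe.
const (
	statusPassing  = "passing"
	statusWarning  = "warning"
	statusCritical = "critical"
)

// severity ranks each known status; a higher rank is more severe.
var severity = map[string]int{
	statusPassing:  0,
	statusWarning:  1,
	statusCritical: 2,
}

// aggregatedStatus collapses a group of checks into a single status.
// Assumptions: the most severe status wins, an empty group counts as
// passing, and an unrecognized status yields an empty string.
func aggregatedStatus(checks []HealthCheck) string {
	worst := statusPassing
	for _, c := range checks {
		rank, ok := severity[c.Status]
		if !ok {
			return ""
		}
		if rank > severity[worst] {
			worst = c.Status
		}
	}
	return worst
}
```

Under these assumptions, a node with one passing, one warning, and one critical check would be reported as critical, and a node with no checks as passing.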
Kyle Havlovitz
dd3368c19e Add keyring http endpoints 2016-11-22 20:10:43 -05:00
Kyle Havlovitz
8c157e0acd Retry with backoff on session invalidation failure (#2475) 2016-11-04 21:53:22 -07:00
James Phillips
2d4fd24eaf Moves the snapshot package up one level. (#2472) 2016-11-03 21:36:25 -07:00
Kyle Havlovitz
1b204eb88d Disallow -bootstrap-expect flag in dev mode (#2464) 2016-11-03 01:54:43 -04:00
Kyle Havlovitz
606662c502 Add snapshot inspect subcommand (#2451) 2016-10-31 19:37:27 -04:00
Kyle Havlovitz
3be132863f Enable snapshots in dev mode (#2453) 2016-10-31 14:39:47 -04:00
James Phillips
c01a3871c9 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
Kyle Havlovitz
ae99f76763 Wait for agent joins to finish in TestClient_RPC 2016-10-25 17:48:11 -07:00
Kyle Havlovitz
e1d850c081 Add wait logic to TestClient_RPC_Pool 2016-10-25 17:48:11 -07:00
James Phillips
9c65bfa768
Fixes port numbers in peers.info. 2016-10-05 18:09:15 -07:00
James Phillips
1488af4277 Merge pull request #2319 from hashicorp/f-bootstrap-abort
Adds check that aborts bootstrap mode if there's an existing cluster.
2016-09-01 09:49:03 -07:00
James Phillips
40e1553cfc
Fixes error message in test. 2016-09-01 09:48:08 -07:00
James Phillips
4d05d692fc
Makes port selection atomic in unit tests. 2016-09-01 01:01:28 -07:00
James Phillips
79aec1b34b
Tweaks comment to be more correct. 2016-08-31 23:54:53 -07:00
James Phillips
c8b184cfd2
Adds check that aborts bootstrap mode if there's an existing cluster. 2016-08-31 21:25:56 -07:00
James Phillips
cda2bd29a9
Copies the member data instead of referencing by pointer. 2016-08-30 16:54:21 -07:00
James Phillips
3c9188c38b
Makes the Raft configuration API easier to consume. 2016-08-30 11:30:56 -07:00
James Phillips
3f16142b40
Adds a log warning when operator peer changes occur. 2016-08-30 10:23:32 -07:00
James Phillips
e5850d8a26
Adds new consul operator endpoint, CLI, and ACL and some basic Raft commands. 2016-08-30 00:02:50 -07:00
James Phillips
53149bd2f9
Makes empty checkServiceNode return a nil.
The change in #2308 had an inadvertent interface change, so we fix that with
a special case in this fix.
2016-08-29 19:12:07 -07:00
James Phillips
3a855d362f
Preallocates result struct, which was a profiling hot spot. 2016-08-26 16:34:28 -07:00
James Phillips
80d1d88590
Removes leader_lease_timeout from stats. 2016-08-25 15:39:19 -07:00
James Phillips
17b70c7efd
Adds a max raft multiplier and tweaks documentation. 2016-08-25 15:36:05 -07:00
James Phillips
2822334bce
Stops scaling the commit timeout. 2016-08-25 15:05:40 -07:00
James Phillips
679b3c0c6b
Increases RPC hold timeout for new default timing.
Rather than scale this we just bump it up a bit. It'll be on the edge in
the lower-performance default mode, and will have plenty of margin in the
high-performance mode. This seems like a reasonable compromise to keep the
logic here simple vs. scaling, and seems in line with the expectations of
the different modes of operation.
2016-08-24 23:35:28 -07:00
James Phillips
57db4bcce6
Adds performance tuning capability for Raft, detuned defaults, and supplemental docs. 2016-08-24 21:58:37 -07:00
James Phillips
defa2a6180 Merge pull request #2226 from abhinavdahiya/rm-health-unknown
Fixes #1775; Removes 'unknown' state
2016-08-17 17:51:04 -07:00
James Phillips
f7eaa06616
Makes the filled-in parts of ServiceNode more explicit. 2016-08-12 18:25:36 -07:00
David van Geest
cdeff022dd
Translate Address to tagged WAN address in HTTP API when appropriate. 2016-08-12 18:25:36 -07:00
James Phillips
ee999f4dce
Removes upper end of muxado handler. 2016-08-09 18:16:41 -07:00
James Phillips
406efb5d91
Closes the conn on bad protocol version. 2016-08-09 18:13:53 -07:00
James Phillips
a984a6703c
Removes support for muxado and protocol version 1. 2016-08-09 18:10:04 -07:00
James Phillips
ce141896df
Updates hashicorp/hcl and hashicorp/hil.
This required a small mod to core Consul code to cope with an interface
change.
2016-08-09 17:24:13 -07:00
James Phillips
66dcefcc4a Merge pull request #2222 from hashicorp/f-raft-v2
Integrates Consul with "stage one" of HashiCorp Raft library v2.
2016-08-09 16:04:48 -07:00
James Phillips
5586ca3ce1
Moves the peers.info content down into a constant. 2016-08-09 11:56:39 -07:00
James Phillips
052cbe3a7d
Adds peers back into bootstrap log, makes initial case consistent. 2016-08-09 11:52:41 -07:00
James Phillips
12ad26e0fc
Tweaks select style. 2016-08-09 11:33:42 -07:00
James Phillips
e07298594e
Adds I/O-sensitive metrics to ACL replication operations. 2016-08-09 11:32:12 -07:00
James Phillips
11ad551204
Switches to a smooth rate limit vs. a bursty one. 2016-08-09 11:29:12 -07:00
James Phillips
5efd35c590
Clarifies replication index shown in the log message. 2016-08-09 11:10:32 -07:00
James Phillips
a771b34de6
Returns from the shutdown wait right away. 2016-08-09 11:09:48 -07:00
James Phillips
17537a0f10
Moves ACL ID sorting interface onto the iterator. 2016-08-09 11:08:26 -07:00
James Phillips
ae1cd5b47d
Switches all ACL caches to 2Q. 2016-08-09 11:00:22 -07:00
James Phillips
04fc5c8a45
Moves ACL ID generation down into the endpoint.
We don't want ACL replication to have this behavior so it was a
little dangerous to have in the shared helper function.
2016-08-09 00:11:00 -07:00
James Phillips
ba1deb5ae9
Removes unsafe "recover to empty" code.
This isn't safe because it would implicitly commit all outstanding log
entries. The new Raft library already has logic to not start a vote if
the current node isn't in the configuration, so this shouldn't be needed.
2016-08-08 19:19:19 -07:00
James Phillips
379eb5ecd0
Tweaks recovery based on interface changes. 2016-08-08 19:19:18 -07:00
James Phillips
1b633e66c5
Moves to a safer design where we don't ingest the initial peers.json file. 2016-08-08 19:19:18 -07:00
James Phillips
aa4e9daf12
Touches up Raft integration after latest changes. 2016-08-08 19:19:18 -07:00
James Phillips
014649abb1
Formats log messages to be consistent. 2016-08-08 19:19:18 -07:00