192 Commits

Author SHA1 Message Date
Seth Vargo
4179aacf11
Add an API method for determining the best status
Given a list of HealthChecks, this determines the "best" status for the
collective group. This is useful for nodes and services, which may have
multiple checks associated with them.
2016-11-29 18:41:46 -05:00
Kyle Havlovitz
338e36cc5d Add logWriter to agent Create() method 2016-11-28 18:36:26 -05:00
Kyle Havlovitz
124f907063 Add monitor http endpoint 2016-11-28 18:36:26 -05:00
Kyle Havlovitz
92ce2c9e39 Use uuids in persist temp files to avoid race (#2494) 2016-11-09 15:22:53 -08:00
Kyle Havlovitz
b1760b223e Improve logging when deregistering a nonexistent service (#2492)
Log a warning instead of a success message when attempting to deregister a nonexistent service. In Consul 0.8 this can be changed to giving an error outright, but for now we can keep the idempotent delete behavior.
2016-11-09 16:56:54 -05:00
Kyle Havlovitz
83d2f36b54 Merge pull request #2480 from hashicorp/b-atomic-writes
Atomic writes for persisting service/check state
2016-11-07 15:36:35 -05:00
Kyle Havlovitz
7a3e0f8275
Add a note about not calling sync for persistCheckState 2016-11-07 15:24:31 -05:00
Kyle Havlovitz
e30b289c6f
Call fsync() for saving check/service state 2016-11-07 13:51:03 -05:00
Kyle McCullough
73b281a27c Add setting to skip ssl certificate verification for HTTP checks (#1984)
* http check: add setting to skip ssl certificate verification

* update http check documentation

* fix typo in documentation

* Add TLSSkipVerify to agent api
2016-11-03 13:17:30 -07:00
James Phillips
233a3a101b Supports WAN and LAN Serf Bind Addresses. (#2468)
* * adding cli config and config file support for specifying the serf wan and lan bind addresses
* updating documentation for serf wan and lan options
Fixes #2007

* Cleans up some small things from #2380.

* Uses the bind default for the agent test for Serf WAN and LAN.
2016-11-03 12:58:58 -07:00
James Phillips
c01a3871c9 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
James Phillips
03ae813bc7 Merge pull request #2389 from hashicorp/jbs-2019
Lower Service tag DNS warning to DEBUG for #2019
2016-10-24 17:05:02 -07:00
Brian Shumate
74a8fbef06
Lower Service tag DNS warning to DEBUG for #2019 2016-10-05 08:45:01 -04:00
Adam Wolfe Gordon
5ac5a8ccfc agent: Stop reaping child processes (resolves #1988)
The consul docker image now uses dumb-init to reap child processes, so
there's no need to reap them ourselves.
2016-10-04 09:36:41 -06:00
James Phillips
57db4bcce6
Adds performance tuning capability for Raft, detuned defaults, and supplemental docs. 2016-08-24 21:58:37 -07:00
James Phillips
4c7a0ed3b0
Merge branch 'master' into f-deregister-critical 2016-08-16 12:53:21 -07:00
James Phillips
ba60afd5d8
Cleans up based on code review feedback. 2016-08-16 12:52:30 -07:00
James Phillips
fbdd021ab9
Adds an "lan" tagged address so we have a way to get them all.
If we didn't have this, then there would be no way to know the LAN
address if address translation was turned on.
2016-08-16 10:49:03 -07:00
James Phillips
4a3d7db24f
Adds ability to deregister a service based on critical check state longer than a timeout. 2016-08-16 01:00:26 -07:00
James Phillips
d336bdd7b0
Adds basic ACL replication plumbing. 2016-08-03 21:24:04 -07:00
Cameron Davison
d138752249
atomic write service state and checks files, fixes #1221 2016-08-03 10:34:12 -05:00
Sean Chittenden
112f3fd468
Give log reviewers a hint as to which check is failing 2016-06-20 15:25:21 -07:00
Sean Chittenden
63adcbd5ef
Revert "Move structs.CheckID to a new top-level package, types."
This reverts commit 2bbd52e3b44ff1b60939a8400264d534662d6d51.
2016-06-07 16:59:02 -04:00
Sean Chittenden
cbb945e76a
Move structs.CheckID to a new top-level package, types.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden
f5ab25163e
Move structs.CheckID to a new top-level package, types.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden
ddbe64a8c8
Float a type balloon. Some strings are square pegs in round holes.
This experiment was brought about because of variable naming
confusion where name and checkIDs were interchanged.  Gave CheckID
an Qualified Type Name and chased downstream changes.
2016-06-07 16:59:02 -04:00
James Phillips
7bf684ece1 Fixes some bad error returns in the persist service and check paths. 2016-04-26 15:03:26 -07:00
James Phillips
ceac68c5eb Merge pull request #1762 from mshean/script-timeout
Add Timeout field to CheckMonitor
2016-04-24 23:08:06 -07:00
Matt Shean
fe4107019e add Timeout field to CheckMonitor 2016-04-20 11:41:30 -07:00
James Phillips
eedeba682b Makes reap time configurable for LAN and WAN. 2016-04-11 00:38:25 -07:00
Wim
b5d45322b4 Allow [::] as a bind address (binds to first public IPv6 address) 2016-03-18 23:59:44 +01:00
James Phillips
c60a526fde Sets up config for more address tags down the road, renames struct members. 2016-02-07 10:37:34 -08:00
Evan Gilman
8fa2a60208 Rectify value of AdvertiseAddrWan when set elsewhere
`AdvertiseAddrs` has been introduced as a configuration option, which
duplicates a few other options, namely `AdvertiseAddrWan`. We need to
use this value elsewhere, so rather than doing a precedence check every
time we need to access it, rectify the value of `AdvertiseAddrWan` to
match
2016-02-06 23:01:45 -08:00
Sean Chittenden
7af6a94edb Factor out duplicate functions into a lib package
Consolidate code duplication and tests into a single lib package.  Most of these functions were from various **/util.go functions that couldn't be imported due to cyclic imports.  The consul/lib package is intended to be a terminal node in an import DAG and a place to stash various consul-only helper functions.  Pulled in hashicorp/go-uuid instead of consolidating UUID access.
2016-01-29 16:57:45 -08:00
James Phillips
343838f12b Adds support for the reap lock. 2016-01-12 21:10:25 -08:00
Ryan Uber
afafae53fd consul: dev mode works 2015-12-26 20:19:36 -05:00
James Phillips
95c708f65e Adds Docker checks support to client API.
Also changed `DockerContainerId` to `DockerContainerID`, and updated the agent
API docs to reflect their support for Docker checks.
2015-11-18 07:40:02 -08:00
James Phillips
5e7523ea4b Adds a slightly more flexible mock system so we can test DNS. 2015-11-15 17:06:00 -08:00
James Phillips
c0bd639662 Makes the version upshift code look at the correct version field. 2015-10-27 14:44:34 -07:00
Diptanu Choudhury
471442e9a4 Making an explicit check to test whether a check is of type Monitor 2015-10-26 19:52:32 -07:00
Diptanu Choudhury
423f7fbcac Not adding the docker check if we couldn't create the client 2015-10-26 16:45:12 -07:00
Diptanu Choudhury
5f8f531d2a Defaulting to Monitor check 2015-10-26 15:02:23 -07:00
Diptanu Choudhury
71ede8addb Implemented Docker health checks 2015-10-26 10:23:57 -07:00
James Phillips
660da92152 Makes the default protocol 2 and lets 3 interoperate with 2. 2015-10-23 15:23:01 -07:00
James Phillips
ad65d953f6 Scales coordinate sends to hit a fixed aggregate rate across the cluster. 2015-10-23 15:23:01 -07:00
James Phillips
66a3d29743 Simplifies the batching function and adds some comments. 2015-10-23 15:23:01 -07:00
James Phillips
5f754c4a87 Does some small cleanups based on PR feedback.
* Holds coordinate updates in map and gets rid of the update channel.
* Cleans up config variables a bit.
2015-10-23 15:23:01 -07:00
James Phillips
d12aa2ffab Moves batching down into the state store and changes it to fail-fast.
* A batch of updates is done all in a single transaction.
* We no longer need to get an update to kick things, there's a periodic flush.
* If incoming updates overwhelm the configured flush rate they will be dumped with an error.
2015-10-23 15:23:01 -07:00
James Phillips
b9d5fb0f90 Flips the sense of the coordinate enable option. 2015-10-23 15:23:01 -07:00
James Phillips
86b112fe31 Does a clean up pass on the Consul side. 2015-10-23 15:23:01 -07:00